Preface



A new millennium has started and new tools, heirs of three decades of research 
and development in the Logic Programming paradigm, are bringing new solu- 
tions to cope with the increasing complexity of today’s computer systems. Com- 
putational logic in general and logic programming in particular will always play a 
key role in the understanding, formalizing, and development of complex software. 

ICLP2001 was the 17th International Conference on Logic Programming 
and continued a series of conferences initiated in Marseille, France, in 1982. This 
year ICLP was held in conjunction with CP 2001, the 7th International
Conference on Principles and Practice of Constraint Programming. A coordinated
program schedule and joint events were organized in order to maximize the
interaction between these two neighboring communities. LOPSTR 2001, the 11th
international workshop on Logic-based Program Synthesis and Transformation,
was also co-located with ICLP 2001 and CP 2001 this year, bringing together
a larger community to share novel research results. Seven satellite workshops
were also associated with the conference and took place on the day following the
conference.

We received 64 papers, among which 23 were selected for presentation at 
the conference and inclusion in the conference proceedings. In addition to paper 
presentations, the conference program also included four invited talks and four 
tutorials. We chose this year to celebrate the founders of the logic programming 
field, namely Alain Colmerauer and Bob Kowalski, who are both celebrating their 
60th birthday. The other two invited talks were given by Patrick Cousot, pio- 
neer in the field of abstract interpretation, and Ashish Gupta, who presented his 
industrial experience on cross-enterprise databases at amazon.com and Tavant 
Technologies. The four tutorials were given by Eric Villemonte de la Clergerie, 
V.S. Subrahmanian, Kazunori Ueda, and Jan Wielemaker. 

I would like to thank all the authors of the submitted papers, the Program 
Committee members, and the referees for their time and efforts spent in the re- 
viewing process, the conference chair Tony Kakas and his team at the University 
of Cyprus for the excellent organization of the conference, and Toby Walsh, the 
CP 2001 program chair, for his constant cooperation and interaction. Last but 
not least, special thanks to Yoann Fabre at the University of Paris 6 for taking 
care of installing and maintaining the paper review system. 
Solving the Multiplication Constraint
in Several Approximation Spaces 



Alain Colmerauer 

Universités de la Méditerranée et de Provence
Laboratoire d’Informatique de Marseille, CNRS 
13288 Marseille Cedex 9, France 
alain.colmerauer@univ-mrs.fr

Given three intervals a, b, c in the real numbers and the multiplication
constraint $z = xy \wedge x \in a \wedge y \in b \wedge z \in c$, we are interested in
establishing and justifying formulas for computing the smallest intervals
a', b', c' which, substituted for a, b, c, do not modify the set of solutions
of the constraint.

We study three cases: (1) the well-known case where a, b, c, a', b', c' are closed
intervals, (2) the case where a, b, c, a', b', c' are intervals, possibly open or
unbounded, (3) the case where a, b, c, a', b', c' are intervals, possibly open or
unbounded, whose lower and upper bounds, if they exist, are taken from a given
finite set.

For this we introduce the general notions of approximation space, of good 
relation, of extension and aggregation of relations and establish three properties 
which can be used for solving other constraints. 
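
As a minimal sketch of case (1) only, the following Python fragment narrows the interval c for closed, bounded intervals. It deliberately leaves out the narrowing of a and b, which requires interval division with special handling of zero and is the harder part of the problem. The names `product_range` and `narrow_c` are illustrative assumptions, not taken from the paper, and a real solver would also round floating-point bounds outward.

```python
from typing import Optional, Tuple

Interval = Tuple[float, float]  # closed interval [lo, hi] with lo <= hi


def product_range(a: Interval, b: Interval) -> Interval:
    """Exact range of x*y for x in a and y in b (closed, bounded intervals).

    The extrema of a product over a rectangle are reached at its corners.
    """
    corners = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    return (min(corners), max(corners))


def narrow_c(a: Interval, b: Interval, c: Interval) -> Optional[Interval]:
    """Smallest c' keeping all solutions of z = x*y, x in a, y in b, z in c.

    Returns None when the constraint has no solution.
    """
    lo, hi = product_range(a, b)
    lo, hi = max(lo, c[0]), min(hi, c[1])
    return (lo, hi) if lo <= hi else None


print(narrow_c((1, 2), (3, 4), (0, 100)))   # (3, 8)
print(narrow_c((-1, 2), (3, 4), (0, 100)))  # (0, 8)
print(narrow_c((1, 2), (3, 4), (10, 20)))   # None
```

The intersection is exact here because the set of products over two closed bounded intervals is itself a closed interval.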
Is Logic Really Dead or Only Just Sleeping?



Robert Kowalski 

Department of Computing, Imperial College, London, UK 
rak@doc.ic.ac.uk

http://www-lp.doc.ic.ac.uk/UserPages/staff/rak/rak.html

There was a time when Logic was the dominant paradigm for human rea- 
soning. As George Boole put it around one hundred and fifty years ago, Logic 
was synonymous with the “Laws of Thought”. Later, for most of the latter half 
of the twentieth century, it was the mainstream of Artificial Intelligence. But 
then it all went wrong. Artificial Intelligence researchers, frustrated by the lack 
of progress, blamed many of their problems on the logic-based approach. They 
argued that humans do not reason logically, and therefore machines should not 
be designed to reason logically either. Other approaches began to make progress 
where Logic was judged to have failed - approaches that were designed to sim- 
ulate directly the neurological mechanisms of animal and human intelligence. 
Insect-like robots began to appear, and a new Machine Intelligence
was born. Logic seemed to be dying - and to be taking Logic Programming
(LP) with it.

The possible death of Logic has important implications for LP, because ar- 
guably the main argument for LP is: 

— that LP is based on Logic, 

— that Logic is the foundation of human reasoning, and 

— that, therefore, LP is more human-oriented and user-friendly than computer 
languages developed mainly for machines. 

It is possible to quarrel with both premises of the argument. In particu- 
lar, the first premise that LP is based on Logic has been attacked by some LP 
researchers themselves, arguing that Logic Programs are better understood as 
inductive definitions. This alternative view of the foundations of LP has much 
technical merit, but it potentially undermines the main argument for LP. I will 
consider one way of rescuing the argument, by outlining how inductive definitions 
can be incorporated in a more general Logic of thinking, as part of a more
comprehensive observation-thought-action agent cycle. However, my main concern
here is with attacks against the second premise of the argument: that Logic is 
fundamental to human thinking. These attacks include the old, familiar ones ad- 
vocating alternative symbolic approaches, most notably condition-action rules. 
They also include more recent ones advocating non-symbolic connectionist and 
situated intelligence approaches. I will examine some of these attacks and try to 
distinguish between those that are justified and those that are simply wrong. 

I will argue that, to address these attacks and to be in a better position to
fight back, Logic and LP need to be put into place: Logic within the thinking
component of the observation-thought-action cycle of a single agent, and LP
within the belief component of thought. In addition to LP, a complete model 
of Computation and Human Reasoning also needs: a logical goal component 
(which includes condition-action rules), other kinds of non-symbolic thinking, 
and a framework that includes other agents. 

I will argue that the observation-thought-action cycle provides a more realis- 
tic framework, not only for Logic as a descriptive theory of how humans actually 
think, but also for Logic as a prescriptive theory of how humans and computers 
can reason more effectively. With such a more realistic framework, even if Logic 
and LP might be only half awake today, they can at worst be only sleeping, to 
come back with renewed and more lasting vigour in the near future. 




Design of Syntactic Program Transformations 
by Abstract Interpretation 
of Semantic Transformations* 



Patrick Cousot 

Département d’informatique, École Normale Supérieure,
45 rue d’Ulm, 75230 Paris cedex 05, France
Patrick.Cousot@ens.fr
http://www.di.ens.fr/~cousot/

Traditionally, static program analysis has been used for offline program
transformation, i.e., an abstraction of the subject program semantics is used to
determine which syntactic transformations are applicable. A classical example is
binding-time analysis before partial evaluation [4,5].

We present a new application of abstract interpretation to the formalization
of source-to-source program transformations:

— The semantic transformation is understood as an abstraction of the subject 
program semantics. The intuition is that the transformed semantics is an ap- 
proximation of the subject semantics because, most often, redundant elements 
of the subject semantics have been eliminated; 

— The correctness of the semantic transformation is expressed by an observa- 
tional abstraction. The intuition is that the subject and transformed semantics 
should be exactly the same when abstracting away from irrelevant hence unob- 
served details; 

— Finally, the syntax of a program is shown to be an abstraction of its semantics 
(in that details of the execution are lost) so that the transformed program is an 
abstraction of the transformed semantics. 

Abstract interpretation theory [1,2] provides the ingredients for designing a
syntactic source-to-source transformation as an abstraction of a
semantics-to-semantics transformation, whose correctness is formally established
through an observational abstraction. In particular, iterative transformation
algorithms are abstractions of the fixpoint semantics of the subject program.

Several examples have been studied with this perspective such as blocking 
command elimination [3], program reduction, constant propagation, partial eval- 
uation, etc. 

References 

1. P. Cousot and R. Cousot. Systematic design of program analysis frameworks. In
6th POPL, pages 269-282, San Antonio, TX, 1979. ACM Press.

* This work was supported in part by the European FP5 project IST-1999-20527
Daedalus. 
2. P. Cousot and R. Cousot. Abstract interpretation frameworks. J. Logic and Comp.,
2(4):511-547, Aug. 1992. 

3. P. Cousot and R. Cousot. A case study in abstract interpretation based 
program transformation: Blocking command elimination. ENTCS, 45, 2001. 
http://www.elsevier.nl/locate/entcs/volume45.html, 23 pages.

4. N. Jones, C.K. Gomard, and P. Sestoft. Partial Evaluation and Automatic Program 
Generation. Int. Series in Computer Science. Prentice-Hall, June 1993. 

5. N.D. Jones. An introduction to partial evaluation. ACM Comput. Surv., 28(3):480- 
504, Sep. 1996. 




X-tegration — Some Cross-Enterprise Thoughts 



Ashish Gupta 

Chief Strategy Office, Tavant Technologies 
542 Lakeside Drive, Suite 5, Sunnyvale, CA 94085 
http://www.tavant.com

One of the main contributions of the internet “revolution” is to make the
concept of connectivity the default state in the minds of individuals and enterprises.
Whereas previously it was very difficult to imagine connecting the systems of an 
enterprise to any other enterprise, today business managers are accepting such 
connectivity as inevitable. This new class of applications will be referred to as 
XERP (cross ERP) in this talk. The author discusses the nature of XERP ap- 
plications and how they differ from ERP applications. We will also discuss some 
of the practical problems encountered in making XERP applications a reality. 

Despite the tremendous amount of publicity and hype around the first few
XERP applications - namely exchanges - the fact remains that they are not
very common. Yet there is a tremendous amount of software that has been
built as infrastructure for XERP applications and several hundred companies
formed to implement XERP applications in different spaces. Most of these have
been formed using ad hoc principles, poorly understood business processes, and
non-existent theoretical foundations, and are frequently ahead of the customer.

We will discuss some areas where there is possibly room to explore the the- 
oretical foundations of XERP applications. We will also discuss several distin- 
guishing characteristics and therefore opportunities in this space. The talk will 
also be structured to solicit opinions of the audience given the open-endedness
of the topic at hand. 
Building Real-Life Applications with Prolog



Jan Wielemaker 

University of Amsterdam, SWI, 

Roeterstraat 18, 1018 WB Amsterdam, The Netherlands 
jan@swi.psy.uva.nl
http://www.swi.psy.uva.nl/usr/jan/

SWI-Prolog has grown into a popular free Prolog implementation. It emphasizes
features that make it a useful prototyping tool. SWI-Prolog is closely
compatible with Quintus and other Prologs in the Edinburgh family and is used for
teaching by many universities.

XPCE/Prolog, based on XPCE (an Object Oriented GUI toolkit for dynam- 
ically typed languages), is a powerful environment for GUI prototyping and the 
implementation of large interactive systems. It has been used for the
development of several knowledge engineering workbenches. XPCE/Prolog is being used
by various research groups in universities and industry.
Natural Language Tabular Parsing



Eric Villemonte de la Clergerie 
ATOLL/INRIA 

Domaine de Voluceau - BP 105, 78153 Le Chesnay Cedex - France 
Eric.De_La_Clergerie@inria.fr
WWW Home page: http://atoll.inria.fr/~clerger 

This tutorial mostly addresses the use of tabular techniques for parsing natu- 
ral languages. During a tabular evaluation, traces of computations are stored in 
a table in order to detect loops, to share computations and to provide a final jus- 
tification for answers. While widely used in Natural Language Processing to cope 
with the high level of ambiguity found in human languages, tabular techniques 
have also been developed in Functional Programming (memoization), Deductive 
Databases (magic-set), and Logic Programming (tabling). 

We first start with a brief review of different grammatical formalisms, point- 
ing out areas of interest for (Constraint) Logic Programming approaches. 

Then, we cover various tabular techniques (CKY, Chart Parsing, Graph- 
Structured Stacks, Automata & Dynamic Programming) for various formalisms 
(Context-Free Grammars, Unification Grammars, Tree Adjoining Grammars), 
trying to sketch a uniform view of these techniques. We mention the
relationships with tabular techniques used in LP (Magic Set and tabling).
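
To make the idea of a table of partial results concrete, here is a minimal sketch of a CKY recognizer, the first of the tabular techniques listed above, for a context-free grammar in Chomsky normal form. The function name `cky_recognize` and the toy grammar are assumptions chosen for illustration; they are not taken from the tutorial or from DyALog.

```python
from itertools import product


def cky_recognize(words, lexical, binary, start="S"):
    """Return True if `words` can be derived from `start`.

    lexical: maps a word to the set of nonterminals that rewrite to it.
    binary:  maps a pair (B, C) to the set of nonterminals A with A -> B C.
    """
    n = len(words)
    # table[i][j] holds every nonterminal that derives words[i:j].
    table = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        table[i][i + 1] = set(lexical.get(word, ()))
    # Fill longer spans from shorter ones; each cell is computed once
    # and then shared by every larger span that uses it.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for left, right in product(table[i][k], table[k][j]):
                    table[i][j] |= binary.get((left, right), set())
    return n > 0 and start in table[0][n]


# Hypothetical toy grammar: S -> NP VP, VP -> V NP.
LEXICAL = {"she": {"NP"}, "eats": {"V"}, "fish": {"NP"}}
BINARY = {("NP", "VP"): {"S"}, ("V", "NP"): {"VP"}}

print(cky_recognize(["she", "eats", "fish"], LEXICAL, BINARY))  # True
print(cky_recognize(["eats", "she", "fish"], LEXICAL, BINARY))  # False
```

Storing sets of nonterminals per span is what lets ambiguous analyses share sub-results instead of being recomputed.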

We briefly present the notions of parse and derivation shared forests produced 
by tabular parsers and their transposition for LP (notion of justification or proof 
forest).

The tutorial is illustrated with the presentation of DyALog, a system used 
to compile unification grammars, TAGs and Logic Programs. 

This tutorial derives from a longer one in French delivered at TALN’99,
whose slides and notes may be found at
http://atoll.inria.fr/~clerger/TALN99.html. Slides for this tutorial will be
available at http://atoll.inria.fr/~clerger/ICLP01.html.
A Close Look at Constraint-Based Concurrency



Kazunori Ueda 

Dept. of Information and Computer Science, Waseda University
3-4-1, Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
ueda@ueda.info.waseda.ac.jp

Constraint-based concurrency, also known as the cc (concurrent constraint)
formalism, is a simple framework of concurrency that features (i) asynchronous 
message passing, (ii) polyadicity and data structuring mechanisms, (iii) chan- 
nel mobility, and (iv) nonstrictness (or computing with partial information). 
Needless to say, all these features originate from the use of constraints and log- 
ical variables for interprocess communication and data representation. Another 
feature of constraint-based concurrency is its remarkable stability; all the above
features were available essentially in their present form by the mid-1980s in
concurrent logic programming languages.

Another well-studied framework of concurrency is name-based concurrency
represented by the family of π-calculi, in which names represent both channels
and tokens conveyed by channels. Some variants of the original π-calculus
featured asynchronous communication, and some limited the use of names in pursuit
of nicer semantical properties. These modifications are more or less related to
the constructs of constraint-based concurrency. Integration of constraint-based
and name-based concurrency can be found in proposals of calculi such as the
γ-calculus, the ρ-calculus and the Fusion calculus, all of which incorporate
constraints or name equation into name-based concurrency.

This tutorial takes a different, analytical approach in relating the two for- 
malisms; we compare the roles of logical variables and names rather than trying 
to integrate one into the other. Although the comparison under their original,
untyped setting is not easy, once appropriate type systems are incorporated into both
camps, name-based communication and constraint-based communication exhibit
more affinities. The examples of such type systems are linear types for the
π-calculus and the mode/linearity/capability systems for Guarded Horn Clauses
(GHC). Both are concerned with the polarity and multiplicity of communica- 
tion, and prescribe the ways in which communication channels can be used. 
They help in-depth understanding of the phenomena occurring in the course of 
computation. 

The view of constraint-based concurrency that the tutorial intends to provide 
will complement the usual, abstract view based on ask and tell operations on a 
constraint store. For instance, it reveals the highly local nature of a constraint 
store (both at linguistic and implementation levels) which is often understood to 
be a global, shared entity. It also brings resource-consciousness into the logical 
setting. This is expected to be a step towards the deployment of cc languages as a 
common platform of non-sequential computation including parallel, distributed, 
and embedded computing. 
Probabilistic Databases and Logic Programming



V.S. Subrahmanian 

University of Maryland, Dept. of Computer Science, USA
vs@cs.umd.edu

Uncertainty occurs in the world in many ways. For instance, image processing 
programs identify the content of images with some levels of uncertainty. Predic- 
tion programs predict when events will occur with certain probabilities. In this 
tutorial, I will focus on probabilistic methods to handle uncertainty. 

We will start with a quick review of how probabilistic information can be 
handled in a relational database setting. An extension of the relational algebra 
to handle probabilistic data will be described. Subsequently, we will discuss 
methods to handle probabilities over time in a relational database. Intuitively, 
when we make statements such as “Package p will be delivered sometime today”,
the granularity of time used will have an impact on how easy/difficult it is to store 
such data. If the granularity of time used is milliseconds, the above statement 
leads to an enormous amount of uncertain information. Techniques to store and 
manipulate such information will be studied. 
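
The effect of granularity can be illustrated with simple arithmetic. The sketch below counts how many distinct time points the statement “sometime today” covers at several granularities. The uniform distribution used in `uniform_probability` is purely an assumption for illustration; the tutorial does not prescribe a distribution, and the function names are hypothetical.

```python
# Number of time points in one day at each granularity.
UNITS_PER_DAY = {
    "hour": 24,
    "minute": 24 * 60,
    "second": 24 * 60 * 60,
    "millisecond": 24 * 60 * 60 * 1000,
}


def time_points_in_day(granularity: str) -> int:
    """How many separate time points 'sometime today' ranges over."""
    return UNITS_PER_DAY[granularity]


def uniform_probability(granularity: str) -> float:
    """Probability per time point, ASSUMING a uniform distribution."""
    return 1.0 / time_points_in_day(granularity)


for g in UNITS_PER_DAY:
    print(f"{g:12s} {time_points_in_day(g):>11,d} points")
```

At millisecond granularity a single statement already ranges over 86,400,000 time points, which is why storing one probability per point explicitly is impractical.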

We continue with a description of probabilistic logic programming methods 
that extend the above database models to logic programs. We will also discuss 
the incorporation of temporal uncertainty in logic programs. We will describe 
characterizations of such programs in terms of logical model theory, fixpoint 
theory, and proof theory. 

The talk will conclude with some directions for future research. 
Understanding Memory Management
in Prolog Systems 



Luis Fernando Castro¹ and Vítor Santos Costa²

¹ Dept. of Computer Science, SUNY at Stony Brook,
luis@cs.sunysb.edu

² COPPE/Sistemas, Universidade Federal do Rio de Janeiro,
vitor@cos.ufrj.br



Abstract. Actual performance of Prolog-based applications largely
depends on how the underlying system implements memory management.
Most Prolog systems are based on the WAM, which is built around a set 
of stacks. The WAM is highly optimized to recover memory on backtrack- 
ing and on tail-recursive predicates. Still, deterministic computations can 
create intermediate structures that can only be freed through garbage 
collection. 

There is a significant amount of literature regarding memory
management for Prolog. Unfortunately, we found relatively little data on how
modern Prolog systems perform memory-wise. Open questions range 
from whether Prolog systems consume the same amount of space, to 
how effective garbage collection is in practice, and to whether we should 
be using sliding or copying based garbage collectors. 

This work aims at investigating the practical aspects of memory man- 
agement in Prolog systems. We present a methodology to compare the 
memory performance of such systems, and we use it to compare two 
different WAM-based systems, namely XSB and Yap. We suggest novel 
techniques for variable shunting and we propose a scheme that can im- 
prove the performance of sliding-based garbage collectors. Last, we eval- 
uate our methodology with larger-scale applications. 



1 Introduction 

Prolog is a high-level, declarative language based on SLD-resolution applied 
over Horn clauses, a powerful subset of First Order Logic. Prolog has been used 
with success in such diverse fields as intelligent database processing, natural 
language processing, machine learning, software engineering, model checking and 
expert system design. Traditional Prolog systems implement a selection function 
that always tries the leftmost goal first and performs a depth-first search with 
backtracking. More recent systems support features such as co-routining for more 
flexible selection of goals, constraints, and tabling. 

Memory allocation is quite an important issue in the implementation of 
Prolog systems. The first main data-structure is the Prolog internal database. 
Database structures are long lived and must be explicitly freed by the program- 
mer. The run-time environment also manages several dynamic data-areas that 
grow during forward execution and contract during backtracking. In WAM-based
implementations [17] these data structures are: the Local stack, which maintains
environments (that is, call records) and choice-points (that is, open alternatives);
the Global stack or Heap, which maintains data structures that can persist even
after the procedure that originally created them terminates; and the Trail, which
stores the data required to reset bindings to variables when backtracking.

Under some circumstances, namely for deterministic tail recursive calls, the 
WAM can reclaim space on the Local stack during forward execution [17]. Some 
space on the trail can also be reclaimed after pruning. Most Prolog systems there- 
fore only reclaim global stack space either on backtracking or through garbage 
collection. This is unfortunate because in many applications memory consump- 
tion is dominated by the global stack. 

In this work, we study memory allocation in WAM-based systems. We address 
what we believe are two major issues. First, we propose an initial methodology 
to compare the memory performance of Prolog systems, and use it to compare 
two different WAM-based systems. Second, we demonstrate the applicability of 
our methodology on larger-scale applications. 

We next discuss in some more detail the main issues in memory management.
We then proceed to present our methodology, and present an evaluation for the
XSB and Yap systems. We start from general memory patterns and then focus
on garbage collection issues. Next, we evaluate some larger scale applications.
Finally, we present our conclusions and propose future work.

2 Principles of Memory Management 

The traditional Marseille Prolog used a single stack for the management of Pro- 
log data areas. More efficient systems allow recovering memory during forward 
execution by dividing this stack into three or four stacks. In the WAM these are 
the Heap, the Local (or environment plus choice-point) stack, and the Trail. 

The Local Stack. The Local stack includes environments and choice-points.
Environments correspond to live clauses, and are the equivalent of activation records
in imperative languages. They have two control fields in the WAM, plus a
clause-dependent number of slots for permanent variables, that is, for variables that
were created in the body of the clause and that span several sub-goals. The
WAM provides several optimizations for deterministic computations such as last 
call optimization and environment trimming. Choice-points correspond to un- 
explored clauses in the search tree. They are created for non-determinate goals, 
that is, for goals where we cannot prove at entry that at most one clause matches. 
In database terminology they are a snapshot: we are to recover the state of a 
computation at the moment of goal entry from the data in the choice-point. 
Choice-points include the current stack tops, program pointers, and the argu- 
ments for the current goal. Choice-points are allocated at clause entry and are 
recovered at backtracking or after execution of a cut. Cut is quite important, as 
support for last call optimization also guarantees that all environments created 
after the target choice-point for the cut can be discarded. Several WAM-derived
systems, such as SICStus Prolog [1] or XSB [12] maintain separate choice-point 
and environment stacks. 

The Trail. The trail is a data structure unique to Prolog. It stores all condi- 
tional bindings, that is, all bindings that must be reset after backtracking. It 
grows as we perform unifications and it contracts during backtracking. Cut may 
also contract the trail, because it means that some bindings will no longer be 
conditional. 
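
The interplay between choice-points and the trail can be sketched as follows. This is a simplified model, not the WAM itself: variables carry a creation timestamp, a choice-point saves only the clock and the trail top, and `backtrack` pops the choice-point (as when the last alternative is tried). All class and method names are illustrative assumptions.

```python
class Var:
    """A logic variable; value None means unbound."""

    def __init__(self, birth):
        self.birth = birth  # creation time, used for the conditional test
        self.value = None


class Machine:
    def __init__(self):
        self.clock = 0
        self.trail = []          # variables whose bindings must be undone
        self.choice_points = []  # (clock at creation, trail top) pairs

    def new_var(self):
        self.clock += 1
        return Var(self.clock)

    def push_choice_point(self):
        self.choice_points.append((self.clock, len(self.trail)))

    def bind(self, var, value):
        var.value = value
        # A binding is conditional only if the variable is older than the
        # most recent choice-point; younger variables vanish on backtracking.
        if self.choice_points and var.birth <= self.choice_points[-1][0]:
            self.trail.append(var)

    def backtrack(self):
        _, trail_top = self.choice_points.pop()
        # Contract the trail, resetting every conditional binding.
        while len(self.trail) > trail_top:
            self.trail.pop().value = None


m = Machine()
x = m.new_var()
m.push_choice_point()
y = m.new_var()
m.bind(x, 1)   # conditional: x predates the choice-point
m.bind(y, 2)   # unconditional: y was created after it
print(len(m.trail))  # 1
m.backtrack()
print(x.value)       # None
```

Only the binding of the older variable is trailed, which is why the trail grows with unifications but stays smaller than the total number of bindings.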

The Heap. The global stack, or Heap, accommodates compound terms and vari- 
ables that do not fit in the environment. More precisely, we can say that the 
global stack stores four different kinds of objects: (i) free variables; (ii) Prolog 
variables bound to constants or compound terms; (iii) references, that is vari- 
ables bound to other variables; and, (iv) functors. A compound term is a functor 
plus an ordered set of possibly bound variables. The WAM does not explicitly 
store the functor for pairs. 

2.1 Garbage Collection 

Garbage collection for Prolog was first implemented in the landmark DEC-10 
Prolog system [16]. The classical approach uses a mark-and-slide collector as 
detailed by Appleby et al. [3] for their seminal work on the SICStus garbage 
collector. The marking step tags all variables reached from the current state, or 
from a choice-point. An extra sweep over the other stacks and two sweeps over 
the Heap adjust the pointers and then compact the Heap. Variable ordering is 
always preserved in this scheme, thus respecting stack segmentation imposed by 
the choice-points. Further optimizations are: 

— Trail references to dead objects can, and must be, reset (or the location they 
point to must be made live). This is called early reset [3]. 

— The top of Heap pointer in choice-points may also point to dead objects. 
One solution is to create a live object at this position. A different solution 
is to keep track of the current choice-point while sweeping the Heap, and 
adjust choice-pointers accordingly. 

— Reference chains can be compressed. This is called variable shunting [11]. 
A complete implementation on systems that perform pointer reversal at 
marking requires an extra step for the garbage collector. 

— It is quite common that the garbage collector will not recover enough space. 
Examples are non-deterministic programs and numerical integer computa- 
tions that may be heavily recursive whilst not creating many objects in the 
Heap. In this case, the system must be able to support stack expansion. 
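
As an illustration of the third optimization, the following sketch compresses reference chains in a toy heap. It assumes a heap modelled as a list of `(tag, value)` cells in which an unbound variable is a `REF` cell pointing to itself. It ignores the trail and choice-point segments, which a real collector must take into account before shunting. The cell layout and function names are assumptions for illustration only.

```python
def deref(heap, i):
    """Follow a chain of REF cells and return the index where it ends."""
    while True:
        tag, val = heap[i]
        if tag != "REF" or val == i:  # non-reference or unbound variable
            return i
        i = val


def shunt(heap):
    """Make every bound REF cell point directly at the end of its chain."""
    for i, (tag, val) in enumerate(heap):
        if tag == "REF" and val != i:
            target = deref(heap, i)
            t_tag, t_val = heap[target]
            if t_tag == "REF":
                heap[i] = ("REF", target)  # chain ends in an unbound variable
            else:
                heap[i] = (t_tag, t_val)   # chain ends in a value: copy it
    return heap


heap = [("REF", 1), ("REF", 2), ("REF", 2), ("REF", 4), ("CON", "a")]
print(shunt(heap))
# [('REF', 2), ('REF', 2), ('REF', 2), ('CON', 'a'), ('CON', 'a')]
```

After shunting, intermediate cells that were reachable only through the chain are no longer referenced and can be reclaimed by the collector.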

The main disadvantage of mark and sweep garbage collectors is that they take
time linear in the size of the Heap. This can be a problem if the Heap grows
very large. One alternative is to use copying based garbage collectors for Prolog 
(please refer to Demoen and Sagonas for a recent discussion [7]). 
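
The sliding step of the classical collector described in this section can be sketched as below. The sketch assumes marking has already been done and that every live `REF` cell points to a live cell. For clarity it uses an explicit forwarding table and builds a new list, whereas production collectors typically compact in place using pointer reversal. The function name and cell layout are illustrative assumptions.

```python
def slide_compact(heap, marked):
    """Compact live cells towards the bottom, preserving their order."""
    # Sweep 1: compute the new address of every live cell.
    new_index = {}
    next_free = 0
    for i in range(len(heap)):
        if marked[i]:
            new_index[i] = next_free
            next_free += 1
    # Sweep 2: relocate pointers and slide live cells down.
    compacted = []
    for i, (tag, val) in enumerate(heap):
        if marked[i]:
            if tag == "REF":
                val = new_index[val]
            compacted.append((tag, val))
    return compacted


heap = [("CON", "a"), ("REF", 0), ("CON", "dead"), ("REF", 3), ("REF", 1)]
marked = [True, True, False, True, True]
print(slide_compact(heap, marked))
# [('CON', 'a'), ('REF', 0), ('REF', 2), ('REF', 1)]
```

Both sweeps visit every cell, live or dead, which is the linear dependence on Heap size noted above; the relative order of surviving cells is unchanged, so the segmentation imposed by choice-points is preserved.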
Last, one should notice that Heap garbage collection can also be used to
recover cells in the Trail. This is possible through early reset, and in the case of
Yap by discarding unconditional bindings. XSB does not recover trail space at
garbage collection: XSB performs trail trimming at cut time, and early reset 
does not actually discard trail entries. XSB uses early reset only to recover Heap 
space by releasing compound terms. 

3 Methodology 

Understanding the memory performance of WAM based systems is a hard task. 
Parameters we can experiment with include very different applications, different 
designs for Prolog engines that build different intermediate structures, and how 
much and when the engine tries to recover memory. Variables we can measure 
include total memory sizes, memory distribution per object type, and time for 
the different algorithms. 

In this work we focus on what we argue are two major issues in Prolog 
memory allocation. The first one is to understand how Prolog allocates objects. To 
achieve this goal, we instrumented each benchmark to call the garbage collector 
at fixed points, and to find out how objects were distributed before and after 
garbage collection. Note that our interest at this point is to understand how 
memory is being allocated at “typical” execution points. The questions we address
are which types of objects are allocated, and whether there is a significant variation
between Prolog engines. For this study we chose to use well understood
benchmarks.

The second problem we focus on is garbage collection: we would like to know 
whether some well-known techniques for garbage collection of Prolog programs 
are worthwhile. One question of interest we discuss involves collection of aliased
pointers. The second question involves improving the performance of
sliding-based collection. The limitations of our initial methodology are
shown when studying sliding-based collection, as actual differences are only
significant for real, substantial applications.

The Benchmark Set. Our initial benchmark set includes a mix of deterministic
and non-deterministic applications. To avoid interference with the stack
protection algorithm we called garbage collection explicitly. We always placed calls to
the garbage collector at the end of the execution, when most of the data
structures had been built but before any final cuts. Program sources can be found at
http://www.cs.sunysb.edu/~luis/gc-bench.tar.gz. Note that the program
may call the garbage collector several times. We only present statistics for the 
instance of garbage collection which presented the largest initial Heap usage, and 
mention any significant variations. The applications are Kish Shen’s simulator 
of AND/OR parallelism, the well known Boyer theorem prover benchmark, the 
Gnu-Prolog compiler, Mike Carlton’s chess-playing game, Bruce Holmer’s pro- 
gram for NAND decomposition, and the Chat Natural Language Analyser and 
database interface. 
We use two systems to perform our analysis: XSB and Yap. Both are WAM
based. Yap has a Local stack, whereas XSB has a choice-point and an envi- 
ronment stack. Before this work, Yap only supported mark-and-slide garbage 
collection, whereas XSB supported both mark-and-slide and copying garbage 
collection. 



4 Memory Management for WAM-Based Systems 

Table 1 presents the first results of our analysis: we show how many choice-points 
were active when we called the garbage collector and how much stack space was 
being allocated at this point. The results indicate a clear difference between 
the first three deterministic benchmarks, and the remaining non-deterministic 
benchmarks. 

For Yap, all the deterministic benchmarks show little usage of local memory. 
Only gprolog creates two choice-points; the other four choice-points come from the
top-level. Little space is used on environments, which shows how effective last
call optimization can be. Heap usage is quite high, and fully dominates memory
usage. Yap actually consumes more Trail than Local stack for these benchmarks,
since it does not recover Trail space when executing cut.

For XSB, the numbers for the Local stack column are the sum of those for 
the environment and choice-point stacks. Regarding choice-points, the behavior 
of XSB is mostly consistent with that of Yap. XSB always uses more Local 
stack than Yap. The gprolog benchmark creates considerably more choice-points 
and uses much more local stack before the first run of the garbage collector. 
We traced this problem to a non-optimal implementation of support for
backtracking hooks, which created unnecessary choice-points. A fix has
been created and included in the system, but since the extra choice-points do not
affect the rest of the results significantly, we decided to show the results as per
XSB Version 2.3’s behavior. The effect of Trail compaction on pruning can be
seen in the differences in Trail usage on the deterministic benchmarks. XSB is
able to recover Trail space almost completely before garbage collection is started.



Table 1. Memory Usage before Garbage Collection.

           Choice-Points     Local Stack         Heap            Trail
Programs      Yap     XSB     Yap     XSB     Yap     XSB     Yap     XSB
sim             4       9    1018    2304  336580  285402   38443       4
boyer           4       9      86     172  408880  144000   58584       4
gprolog         6     369     125   13082  124882  192571    2914    2154
nand          264     296    4730    6306    1859    1425    1096     830
chat          242     293    4686    6107    4423    4028    1355     879
chess       12996   17968  291091  445701  122301   88361   32723   32388

The non-deterministic benchmarks present a very different story. The nand
benchmark performs search over a shallow tree with a high branching factor.
It has a very small memory footprint, even for long running times. We were
surprised by the high number of choice-points it creates. On closer analysis we
found that most choice-points are for deterministic set operations, and could
have been pruned away.

Execution of chat follows two different stages: parsing and query optimiza- 
tion. In both cases, the number of choice-points and the Heap memory usage 
tend to increase with problem size. In the benchmark, chat handles relatively 
small problems, so the memory footprint is small. We were also surprised at 
finding so many choice-points for chat. In this case, closer analysis showed that 
missing cuts in simplify/3 and split_quants/6 were the main culprits, but 
even so chat has quite a few non-deterministic procedures. The chess bench- 
mark is quite interesting: it creates a huge number of choice-points, but it also 
creates objects in memory. Again, more than 90% of them correspond to a de- 
terministic procedure, strength/3. Notice that Trail usage is significant in the 
non-deterministic benchmarks. In nand Trail usage is close to Heap usage. This 
is because almost every binding is conditional and few data structures need to 
be created in the Heap. The ratio is smaller for the other benchmarks, but it is 
still important. 

The Heap. The Heap has some of the more interesting results in Table 1, with 
significant differences between XSB and Yap. To better understand these results,
Table 2 shows which objects we can find in the Heap before we enter garbage
collection. The first three columns represent unbound variables, references, and 
their percentage over the total number of cells. The next four columns represent 
cells bound to constants (atoms or numbers), pairs, compound terms, and their 
total percentage. The last two columns represent the number of functors and the 
total percentage. These numbers have been obtained before garbage collection, 
so some (often most) of these objects are garbage. 
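
As a worked check of how the percentage columns of Table 2 are obtained, the sketch below recomputes them for the XSB row of nand, assuming each percentage is taken over the sum of all six cell counts in the row. The function name `heap_breakdown` and its argument names are ours, chosen for illustration.

```python
def heap_breakdown(unbound, refs, const, pairs, comp, functors):
    """Return (total cells, variables %, objects %, functors %) for one row."""
    total = unbound + refs + const + pairs + comp + functors
    variables = 100.0 * (unbound + refs) / total
    objects = 100.0 * (const + pairs + comp) / total
    functor_share = 100.0 * functors / total
    return total, round(variables, 2), round(objects, 2), round(functor_share, 2)


# Cell counts for nand under XSB, as listed in Table 2.
nand_xsb = heap_breakdown(unbound=26, refs=0, const=770, pairs=472, comp=71, functors=86)
print(nand_xsb)  # (1425, 1.82, 92.14, 6.04)
```

The total of 1425 cells agrees with the Heap usage reported for nand under XSB in Table 1.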

Constants are the most popular object for most benchmarks. Prolog terms 
are trees, and for these benchmarks most leaves are constants, not unbound 
variables. The two exceptions are gprolog and chess. Both heavily use lists 
and (partially instantiated) compound terms. Unbound variables are only a small 
percentage of total Heap objects: chat with XSB is the one benchmark where
more than 10% of the Heap is consumed by unbound variables, and this seems
to be an artifact of the XSB implementation, as Yap has far fewer unbound
variables. References are important in Yap, less so in XSB.

The column on Functors gives the number of compound terms that were 
actually built. The correlation between the number of Functors and compound 
terms shows how many terms are being shared in the Prolog Heap: this sharing 
is particularly important in the gprolog compiler and in the chess program. 

The results show huge differences between XSB and Yap. XSB performs much
better in the deterministic benchmarks sim and especially in boyer, where Yap
consumes three times as much memory. It performs worse in gprolog. XSB
also performs better for the non-deterministic benchmarks, though by a smaller
factor. We looked into these differences and found two major contributing
factors: (i) the compiler; and (ii) built-in implementation.

Table 2. Memory Objects before Garbage Collection.

               Variables                  Objects                 Functors
Programs     Unb   Refs       %   Const  Pairs   Comp       %  Functor       %
Yap
sim         4267  39082  12.93%  153504  23643  57160  69.88%    57625  17.19%
boyer          1  94063  23.01%  194100     19  79415  66.90%    41282  10.10%
gprolog     1185   4012   4.19%   12881  49689  49906  90.58%     6501   5.24%
nand           5      2   0.38%    1162    472    110  93.81%      108   5.81%
chat          33    322   8.03%    1786    339   1127  73.52%      816  18.45%
chess        126  11039   9.13%   21872  42591  38281  84.01%     8392   6.86%
XSB
sim         3918  20107   8.42%  118689  24754  59464  71.10%    58470  20.49%
boyer         17      5   0.02%   60710     21  41993  71.34%    41254  28.65%
gprolog    14393   4712   9.92%   41899  75013  49505  86.42%     7049   3.66%
nand          26      0   1.82%     770    472     71  92.14%       86   6.04%
chat         613    154  19.04%    1349    326    911  60.20%      675  16.76%
chess        811   4566   6.09%   13952  27805  34562  86.37%     6665   7.54%

Regarding (i), the Yap compiler traditionally allocates void variables or temporary
variables in body goals in the Heap, not in the current environment. This
is quite a bad idea if the program is deterministic and tail recursive, because 
Heap space is only recovered through garbage collection, but Local Stack will be 
recovered at last call. For instance, we found that the difference for the boyer 
benchmark results from the Yap compiler reserving Heap cells to store results 
for the arg/3 and functor/3 built-ins. XSB allocates the same cells in the Local 
stack, and manages to reduce Heap usage by a factor of three. The Yap-4.3.19 
compiler addresses these problems by forcing void variables to be allocated in the
Local Stack and by not initializing arguments to calls of functor/3 and arg/3
if the modes are known beforehand. Similar experience has been reported in [6].
Factor (ii) occurs when we have very different implementations of the same 
built-in. The factor explains why the number of Functor cells in the Heap differs 
between the two systems, as both implement copying and will create the same 
number of compound terms, modulo built-ins. We found that Yap would often 
use more space because the write/1 and nl/1 built-ins would leave garbage, 
such as terms describing the current stream, in the Heap. 



5 Garbage Collection 

The Heap is the dominant area in the deterministic benchmarks, and is quite 
important for non-deterministic benchmarks. Heap garbage collection recovers 
memory in the Heap and, in the case of Yap, also in the Trail. Table 3 shows 
how garbage collection performs on both systems. 

Table 3. Effectiveness of Garbage Collection.

                       Heap                            Trail
                Yap             XSB             Yap             XSB
Programs    cells       %   cells       %   cells       %   cells       %
sim          6558     98%    8252     98%       2     99%       4      0%
boyer          21     99%   39809     73%       2     99%       4      0%
gprolog      6247     94%   14663     93%     238     91%    1175  54.55%
nand         1468     21%    1332      7%     596     45%     122  14.70%
chat         1850     58%    2037     50%     433     68%     286  32.54%
chess       66798     45%   67503     24%     979     97%   26393  81.49%


Garbage collection is very effective for the deterministic benchmarks. For 
Yap, in all three cases more than nine in every ten cells were garbage. The 
boyer benchmark is an extreme case: Yap can recognize that almost everything 
is garbage at the point we force garbage collection. Garbage collection for sim
was also performed very near the end of the computation, and almost everything
was recovered. Results for gprolog are somewhat worse, but still above 90%. We
would expect these results to improve further as we increase benchmark size. The
XSB garbage collector is somewhat less effective than Yap’s. This is especially 
true on boyer. We found that environment trimming plays an important role in 
the efficiency of the collection for boyer. The test predicate stores the resulting 
term, which is quite large, in a local variable. This variable is dead by the time we 
call garbage collection. In fact, XSB does not perform environment trimming, 
nor does it implement any more sophisticated method to inform the garbage 
collector of live variables. The term ends up being marked, thus resulting in the 
large difference of efficiency between XSB and Yap. 
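
The Yap Heap percentages can be reproduced from the cell counts, assuming that the first figure given for each system in Table 3 is the number of Heap cells left after collection and that Heap usage before collection is the value listed in Table 1. This is our reading of the tables rather than something the text states explicitly, and the helper name `recovered_percent` is illustrative.

```python
# (Heap cells before collection from Table 1, Heap cells after from Table 3), Yap only.
YAP_HEAP = {
    "sim": (336580, 6558),
    "boyer": (408880, 21),
    "gprolog": (124882, 6247),
    "nand": (1859, 1468),
    "chat": (4423, 1850),
    "chess": (122301, 66798),
}


def recovered_percent(before, after):
    """Share of the Heap, in percent, that the collector reclaimed."""
    return 100.0 * (before - after) / before


for name, (before, after) in YAP_HEAP.items():
    print(f"{name:8s} {recovered_percent(before, after):6.2f}%")
```

Truncated to whole percentages, these values match the Yap Heap column of Table 3 (98, 99, 94, 21, 58, and 45).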

The story is quite different for the non-deterministic benchmarks. The nand 
benchmark consumes little memory, and there is very little to recover. Only one 
in five cells is garbage. Notice that XSB recovers fewer cells than Yap, but that is
only because it was using fewer cells in the first place. Ultimately, the intermediate
data-structures we build in this particular query for nand are very small, and 
we build little we can discard. Parsing in chat is much more deterministic and 
we can recover quite a lot more: about half the cells are garbage. The worst 
result for both garbage collectors is from chess. Although we consume much 
more memory than for the other non-deterministic benchmarks, most Heap cells 
are still reachable at this point. This suggests that we can find non-deterministic
programs which use significant amounts of Heap but where garbage collection
may not perform well. Again, both Yap and XSB turn out to mark about the
same number of cells in chess. Yap does recover more memory because in Yap 
temporary variables were stored in the Heap. These variables are easily recovered 
by the garbage collector. XSB stores the same variables in the environments, 
but space for the environments is protected by the choice-points and cannot 
be recovered. So it turns out that storing temporary variables in the Heap is 
a better strategy for chess, as space can always be recovered through garbage 
collection! We also experimented with removing the most obvious shallow choice-points
by introducing cuts in deterministic procedures, but our results do not
show significant improvements.

Heap Objects. An analysis of the Heap objects after garbage collection revealed 
that the ratios between kinds of objects didn’t change considerably. One inter- 
esting result is that most of the references left by Prolog were actually garbage. 
Even chat had a substantial reduction in references, eliminating about 70% of 
them. A second interesting observation is that the number of compound terms 
now closely tracks the number of functors for all benchmarks but chess. Only
in the case of chess do we still have many references to the same compound term
(with 30359 compound terms and 4461 functors). This suggests that only one
pointer to a compound term is actually useful. Finally, in chat, the number 
of free variables actually increases after garbage collection for Yap (from 33 to 
121), whereas in XSB it strongly decreases (from 613 to 17). This is because 
Yap implements early reset by actually resetting the memory position to be a 
variable, whereas XSB stores a constant. This observation suggests that early 
reset is in fact quite important for chat. 



5.1 Reference Chains 

It has been shown before that reference chains are short in Prolog programs [13,14].
In fact, there are few references. Table 4 shows how deep reference chains go in
the two systems. We count reference chains starting both from the Heap and 
from Environments. 

The longest chains are in sim and then in chat. gprolog has fewer variables,
and also shorter reference chains. The sim benchmark is remarkable as it actually
exhibits a substantial number of reference chains with three cells. Also, the
number of double reference chains is significant. The differences between Yap
and XSB for gprolog are probably caused by different built-in implementations.

Variable Shunting. Variable shunting is an optimization where we try to reduce 
the length of reference chains by jumping over a member of the chain. Reference 
shunting for the WAM is discussed by Sahlin and Carlsson [11]. A major problem 
with complete variable shunting for the WAM is that it requires an extra step. We
next discuss two simple transformations, easy shunting and easy undeffing, that
can be implemented with little overhead. The transformations are performed at
marking time. We experimented with Yap.

Table 4. Reference Chains after Garbage Collection.

                      Yap                         XSB
Programs       0      1      2      3      0      1      2      3
sim         1006    867    256     10    921    873    248     10
boyer          2      4      0      0      3      5      0      0
gprolog        2    407      0      0    382   1075      0      0
nand          22    356      0      0      7    363      0      0
chat         238    328     26      2     45    350     22      6
chess        100     19      0      0    803    105      0      0




Fig. 1. Easy Shunting 



Easy shunting processes two unmarked cells. The first cell, CUR, is an un- 
marked reference. It may be placed in the Heap or in the Local stack. Moreover, 
CUR may be older or younger than the current choice-point. The second cell,
NEXT, is a reference, a compound term, or a constant, but NEXT must be younger
than the current choice-point. We know that since NEXT is unmarked so far, it
has not been bound from a more recent choice-point (otherwise it would have 
been reached already). We also know that it has not been trailed so far (oth- 
erwise it would have been reached) and we know the binding is deterministic, 
as the cell is more recent than the current choice-point. So we know that the 
binding for NEXT is unconditional, and we can replace the value in CUR with the 
value in NEXT. A further improvement to easy shunting has been implemented in 
Yap. We would like to also process chains with marked variables, but shunting 
is incorrect if the marked variable was bound before the latest choice-point. Yap 
addresses this case by resetting all Heap variables when marking the Trail (they
are still marked). Therefore, all current references must have been bound at the
current choice-point. We believe our result is close to what Sahlin’s algorithm 
achieves, without the need for an extra step. In fact, Sahlin himself [11] suggests 
the extra step is only required because SICStus uses pointer reversal, which is 
not used in Yap. 
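
The basic easy shunting step can be illustrated with a small sketch. It is a toy model under our own assumptions, not Yap's implementation: the Heap is a Python list of `(tag, value)` cells, an unbound variable is a `REF` cell that points at itself, `marked` is the set of addresses marked so far, and every address at or above the hypothetical boundary `hb` is younger than the current choice-point. Only Heap cells are modelled, although the text allows CUR to live in the Local stack as well.

```python
REF, CONST = "REF", "CONST"


def easy_shunt(heap, marked, cur, hb):
    """Try to shunt the reference stored at heap[cur]; return True on success."""
    tag, nxt = heap[cur]
    if tag != REF or cur in marked or nxt == cur:
        return False              # CUR must be an unmarked, bound reference
    if nxt in marked or nxt < hb:
        return False              # NEXT must be unmarked and younger than hb
    if heap[nxt] == (REF, nxt):
        return False              # NEXT is unbound, so there is no value to copy
    heap[cur] = heap[nxt]         # NEXT's binding is unconditional: copy it into CUR
    return True


heap = [(REF, 2), (CONST, "a"), (REF, 3), (CONST, "b")]
print(easy_shunt(heap, set(), cur=0, hb=1), heap[0])  # True ('REF', 3)
print(easy_shunt(heap, set(), cur=0, hb=1), heap[0])  # True ('CONST', 'b')
```

Applying the step twice collapses the two-cell chain starting at address 0, so that the cell ends up holding the constant directly.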

The second transformation, easy undeffing, processes two unmarked cells. 
The first cell is an unmarked reference in the Heap, which must be more recent 
than the previous choice-point. The second cell is an unbound cell in the Heap,
and must also be more recent than the current choice-point. The transformation
simply moves the unbound variable from NEXT to CUR, that is, we reset CUR and
set NEXT to point at CUR.




Fig. 2. Easy Undeffing

Easy undeffing is most useful if we never need to mark NEXT. In this case, we
can actually recover a cell. In fact, our motivation for easy undeffing derives
from the Prolog system’s parser. When parsing Prolog terms that contain vari- 
ables, we first create the variable, and then point to it from the Prolog term. 
There is therefore benefit in undeffing the variable to the term at garbage collec- 
tion. On the other hand, there is little benefit to this optimization if NEXT is part 
of some compound term and eventually gets marked. An ideal solution would 
thus be to only undefine a cell if its reference count is zero. This was indeed 
recently proposed by Demoen [5]. On the other hand, we believe our solution 
catches a common case and it does have the advantage of not requiring reference 
counts. 
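
A minimal sketch of the easy undeffing step follows, under the same kind of toy assumptions as before (none of them taken from Yap's sources): the Heap is a list of `(tag, value)` cells, an unbound variable is a `REF` cell pointing at itself, `marked` holds the addresses marked so far, and a single hypothetical boundary `hb` stands in for the choice-point tests on both cells.

```python
REF, CONST = "REF", "CONST"


def easy_undef(heap, marked, cur, hb):
    """Move the unbound variable from NEXT to CUR; return True on success."""
    tag, nxt = heap[cur]
    if tag != REF or cur in marked or nxt == cur or cur < hb:
        return False              # CUR must be a young, unmarked, bound reference
    if nxt in marked or nxt < hb or heap[nxt] != (REF, nxt):
        return False              # NEXT must be a young, unmarked, unbound cell
    heap[cur] = (REF, cur)        # reset CUR: it becomes the unbound variable
    heap[nxt] = (REF, cur)        # NEXT now points at CUR
    return True


heap = [(CONST, "x"), (REF, 2), (REF, 2)]
print(easy_undef(heap, set(), cur=1, hb=1), heap)
# True [('CONST', 'x'), ('REF', 1), ('REF', 1)]
```

If nothing else reaches the cell at address 2 during marking, it stays unmarked and can be reclaimed, which is the case the text describes as recovering a cell.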



Table 5. Reference Chains after Shunting.

                  Chains                     Total
Programs      0     1     2     3  without    with      %
sim         962   218    33     6     6558    5649    16%
boyer         2     4     0     0       21      21     0%
gprolog       2     5     0     0     6247    6046   3.3%
nand         22   356     0     0     1468    1468     0%
chat        265   313    26     2     1850    1819   1.7%
chess       100    19     0     0    66798   66798     0%


Table 5 shows the performance of these two simple optimizations. The three 
benchmarks that benefit are the ones where variables and references count, sim, 
gprolog, and chat. In all three cases the benefit stems from easy shunting, never 
from easy undeffing. We believe this is because we find so few unbound variables. 
The improvements are most impressive in the case of sim and of gprolog where 
Heap references almost disappear, generating a 16% space improvement for sim. 

5.2 Garbage Collection Performance 

We have so far concentrated on how well the garbage collector performs on 
its task, recovering Heap (and maybe Trail) cells. Actual applications, namely 
deterministic applications, may call the garbage collector very often. Garbage 
collection time may thus be a substantial component of running time. We would 
like to reduce this time to a minimum. Traditionally, Prolog systems rely on 
mark-and-slide garbage collection [3]. Such garbage collectors preserve Prolog 
segments and variable ordering but they require time proportional to the total 
amount of Heap. This is extremely inefficient if, for example, only 1% of the 
cells have been marked, as is the case in boyer. In contrast, copying garbage 
collectors require time proportional to the amount of live data, but can only use 
half of total stack space and make it harder to efficiently support choice-points. 

We propose a technique that takes time dependent on the amount of live data
rather than on the size of the original Heap. We name it indirect sliding. The
idea is that we use an extra, intermediate area to store pointers to every marked
cell. We then sort these pointers and use them to perform sliding. Our algorithm
works as follows:

1. In the marking phase, whenever we mark a new cell, we also store a pointer
to the cell in an extra data area, say H*;

2. if the number of pointers in H* is below some threshold, then we sort them,
otherwise we discard them and perform traditional sliding;

3. sweep the pointer array, using the fact that if pointer H*[i] points to cell
H[j], cell H[j] will move to H[i].

XSB uses an extra buffer to store this area. Yap uses the area between Heap 
and Local stack. We changed Yap so that overflow is detected if the free space 
between stacks goes below 1/8 of total stack space. While storing indirection 
cells we first check whether H* has overflowed into Local space. If it has, we stop storing
pointers and instead commit to standard mark-and-slide collection. Otherwise,
we use indirection if less than 10% of the Heap cells were marked. XSB uses a 
threshold of 20%. Note that one major advantage of indirection is that we can 
wait until we have enough information to commit to a specific strategy. 
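
The three steps and the threshold test can be sketched as follows. This is a simplified model under stated assumptions, not the Yap or XSB code: the Heap is a Python list of `(tag, value)` cells, `marked_ptrs` plays the role of the pointer area and lists each marked address exactly once, every marked `REF` cell points at a marked cell, and a dictionary of new addresses replaces the pointer-updating machinery of a real sliding collector. The fallback branch merely stands in for traditional sliding by scanning the whole Heap. The default threshold of 10% is the one reported for Yap.

```python
REF, CONST = "REF", "CONST"


def collect(heap, marked_ptrs, threshold=0.10):
    """Compact heap in place, keeping only marked cells; return the strategy used."""
    live = len(marked_ptrs)
    if live < threshold * len(heap):
        # Indirect sliding: sort the stored pointers, cost depends on live cells only.
        order = sorted(marked_ptrs)
        strategy = "indirect"
    else:
        # Stand-in for traditional sliding: visit every Heap address in order.
        is_live = set(marked_ptrs)
        order = [addr for addr in range(len(heap)) if addr in is_live]
        strategy = "slide"
    new_addr = {old: new for new, old in enumerate(order)}
    for i, j in enumerate(order):      # i <= j, so moving in ascending order is safe
        tag, val = heap[j]
        if tag == REF:
            val = new_addr[val]        # relocate the pointer to its target's new slot
        heap[i] = (tag, val)           # cell H[j] moves to H[i]
    del heap[live:]
    return strategy


heap = [(CONST, k) for k in range(40)]
heap[5] = (REF, 30)
heap[30] = (CONST, "live")
print(collect(heap, [30, 5]), heap)  # indirect [('REF', 1), ('CONST', 'live')]
```

Because the decision is taken only after marking, the same pointer list can simply be dropped when too many cells turn out to be live.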

To our knowledge, a similar idea was first proposed for the LaTTe Java vir- 
tual machine mark and sweep collector [4], where it is called selective sweeping. 
LaTTe names our pointer area the set of live objects. Their algorithm thus works 
at object level, not at cell level. This is an advantage of Java, a language where 
pointers can only point to objects. Sahlin’s O(n log n) algorithm for Prolog works
from a similar principle [10], but instead of using an extra area it requires an
extra step to find a set of live blocks. We propose indirection for mark-and-slide
collection, but indirection can also be useful for copying-based garbage collectors.
The idea is to maintain H* as a heap, and always mark first the largest
pointer in the heap. Reverse pointers can be accessed through the Trail.

Performance Evaluation for Yap. In order to study the performance of these
techniques we compared performance using three deterministic Prolog programs.
The results are shown in Table 6. The emul benchmark is from Van Roy’s benchmark
set, fsa is van Noord’s finite state machine running the test dg5 [15], and
bn_lbl_kernel is a test run for Angelopoulos and Cussens’ Markov Chain Monte
Carlo system [2]. We fix the stack space and force a fixed strategy throughout
execution. Both garbage collectors are called the same number of times and col- 
lect the same amount of garbage. Times shown are the accumulated time spent 
on garbage collection during the whole execution of the programs. 

The three examples show clear superiority of the indirection algorithm for de- 
terministic benchmarks. The emul benchmark has a fixed working set of about 
65KB. For the smallest memory configurations efficiency is lower than 80%, 
and the overhead of the sorting algorithm makes indirection slower. As we grow 
stack size, the number of garbage collections decreases linearly, but sliding-based 
garbage collection must still go through an increasing Heap. On the other hand, 
performance of the indirection garbage collector improves because the working 
set is constant. Results for indirection could be even better if not for the increase
in Trail size: for larger configurations indirection actually spends more
time cleaning the Trail than the Heap. The story for bn_lbl_kernel is similar.
The only difference is that this is a small example, so the working set oscillates
between 3KB and 80KB and indirection is always better. Again, Trail sweeping
dominates for the largest configurations. The fsa benchmark is somewhat different
because the working set grows in a sawtooth pattern from as little as 4KB to as much
as 300KB. It is a program that can greatly benefit from generational garbage
collection [9]. Indirection works badly when collection efficiency is low, so the results
are better for sliding at first. As usual, indirection benefits the most from larger
stacks.

Table 6. Performance of Garbage Collection Algorithms for Yap.

                emul                 fsa             bn_lbl_kernel
heap         slide  indirect     slide  indirect     slide  indirect
512          11.33     11.57      5.75      6.91      0.99      0.57
1024          6.67      5.31      5.19      6.16      0.69      0.20
2048          4.93      2.71      3.32      3.25      0.67      0.16
4096          3.93      1.48      2.13      1.65      0.67      0.13
8192          3.70      1.04      1.60      0.76      0.62      0.11
16384         3.33      0.60      1.30      0.42      0.48      0.13
32768         3.06      0.60      1.07      0.24      0.61      0.11
65536         3.06      0.51      0.97      0.12      0.62      0.10
131072        3.01      0.24      0.96      0.10       N/A       N/A

Performance Evaluation for XSB. The performance evaluation of the garbage
collectors in XSB was carried out by analyzing three benchmarks. emul is the
BAM emulator presented in the previous section. The iproto [8] benchmark is
an application of the XMC model checking system which relies heavily on the 
tabling mechanisms of XSB. Finally, the justifier builds a justification proof 
for a property verified by the XMC model checker. 

In Table 7 we fix the garbage collection strategy and vary the initial size
of the Heap. For all benchmarks, the copying collector is faster than
the others. One important thing to notice is that the copying collector in XSB
is not segment-preserving. This interacts with backtracking, so the amount of
data collected may differ from that of the other collectors.

For the emul benchmark, indirect sliding provides an interesting alternative
to copying, when compared to sliding. The iproto benchmark is interesting
in that the collector always marks more than 20% of the original heap. The
indirection mechanism, in this case, is never used. Still, the results show that 
the overhead in creating the pointer buffer is small. The justifier benchmark 
has a behavior similar to iproto, except that the last collection is able to collect 
most of the heap. Indirect sliding is only faster when we start with a heap large 
enough so that the number of collections is small. 

Table 7. Performance of Garbage Collectors for XSB.

                      emul                      iproto                    justifier
Heap Size     copy    slide indirect     copy    slide indirect     copy    slide indirect
512           1.68     2.72     3.86     3.85     4.97     4.91    21.52    25.69    28.17
1024          0.78     1.49     1.60     3.51     4.72     4.80    21.10    26.25    26.78
2048          0.34     1.00     0.80     3.14     4.23     4.26    20.98    25.94    26.80
4096          0.20     0.71     0.41     1.03     1.31     1.25    21.13    26.20    24.94
8192          0.11     0.57     0.18     0.18     0.27     0.33     4.58     5.90     6.46
16384         0.04     0.48     0.10      n/a      n/a      n/a     1.81     2.61     4.60
32768         0.02     0.49     0.04      n/a      n/a      n/a     0.69     1.33     1.91
65536         0.01     0.44     0.02      n/a      n/a      n/a     0.26     0.77     0.64
131072        0.01     0.41     0.02      n/a      n/a      n/a     0.03     0.97     0.03


6 Conclusions 

We present what we believe is the first systematic and comparative analysis 
of memory allocation in two Prolog systems, XSB and Yap. Both systems are 
based on the same underlying abstract machine, the WAM, and we would expect 
similar memory usage. In fact, we found significant differences. One difference 
we expected is that Yap uses much more Trail than XSB. On the other hand, 
we were surprised that what was considered a relatively minor issue, allocation 
of temporary variables in the body of clauses, could have such a major impact 
on several different benchmarks. There is no always-best solution: environment 
allocation is the best solution for deterministic tail-recursive programs, but is 
worse than Heap allocation for all other programs. Fortunately, this problem 
can often be addressed by not initializing output arguments to built-ins. Built- 
in implementation is indeed a determinant factor in memory usage. Built-ins may 
produce the correct results and leave garbage or, even worse, unnecessary choice- 
points in the stacks. In the worst case this will compromise last call optimization 
and kill application performance for large queries. Comparative analysis, as we 
did for Yap and XSB, is a good way to clarify these issues. 

Both systems performed similarly in garbage collection except for boyer, 
where XSB suffered significantly from not implementing variable trimming. Again 
an arguably minor issue showed itself to have a huge impact on system perfor- 
mance. Note that garbage collection does not always work for Prolog, as shown by 
chess. Our study also showed a significant interaction between cut and garbage 
collection. We also studied two simple alternatives to the variable shunting tech- 
niques that have been proposed for SICStus Prolog. Although few programs do 
benefit from them, we found programs that do, and we show that non pointer 
reversal-based systems can implement shunting with good results for a rather 
small overhead. Last, we presented indirect sliding, a solution that like copying 
takes time dependent on the live set, and that for a small overhead gives us the 
freedom to choose the best method just before we scan the Heap. Our results 
show significant improvements. 

Our work has resulted in several improvements to both Yap’s and XSB’s
memory management systems. We would like to improve these systems even 
further. One problem we found is that it is very hard to detect why two sys- 
tems are consuming different amounts of memory. We would like to build better 
tools towards facilitating the understanding of these issues. Moreover, we have 
so far concentrated on Prolog, but constraint and tabling programs deserve to 
be studied. Namely, the results we obtained for justifier showed very in- 
teresting properties that require further analysis. We would also like to study 
how generations, as implemented in SICStus Prolog and ECLiPSe, can further 
improve garbage collector performance, and whether copying can indeed also 
benefit from indirection. Ultimately, we hope that our contributions towards a 
systematic approach to memory management will result in more robust logic 
programming systems that can support a wider number of applications well. 


Abstract. This paper describes the development of the PALS system,
an implementation of Prolog that efficiently exploits or-parallelism on 
share-nothing platforms. PALS makes use of a novel technique, called 
incremental stack-splitting. The technique builds on the stack-splitting 
approach, which in turn is an evolution of the stack-copying method used 
in a variety of parallel logic systems. This is the first distributed imple- 
mentation based on the stack-splitting method ever realized. Experimen- 
tal results obtained on a Beowulf system are presented and analyzed. 



1 Introduction 

Or-parallelism (OP) arises from the non-determinism implicit in the process 
of reducing a given subgoal using different clauses of the program. The non- 
deterministic structure of a logic programming execution is commonly depicted 
in the form of a search tree (a.k.a. or-tree). Each internal node represents a 
choice-point, i.e., an execution point where multiple clauses are available to 
reduce the selected subgoal. Leaves of the tree represent either failure points 
(i.e., resolvents where the selected subgoal does not have a matching clause) or 
success points (i.e., solutions to the initial goal). A sequential computation boils 
down to traversal of this search tree according to some predefined search strategy. 
While a sequential execution attempts to use one clause at a time to reduce
each subgoal, eventually using backtracking to explore the use of alternative
clauses, OP allows the use of different threads of execution (computing agents)
to concurrently explore distinct alternatives emanating from a choice-point. If 
an unexplored branch (i.e., an untried clause to resolve a selected subgoal) is 
found, the agent picks it up and begins execution. This agent will stop either if 
it fails (reaches a failing leaf), or if it finds a solution. In case of failure, or if the 
solution found is not acceptable to the user, the agent will backtrack, i.e., move 
back up in the tree, looking for other choice-points with untried alternatives to 
explore. The agents may need to synchronize if they access the same node in the 
tree. Intuitively, OP allows the concurrent search of alternative solutions to the 
original goal. The importance of the research on efficient techniques for handling 
OP arises from the generality of the problem — technology originally developed 
for parallel execution of Prolog has found application in areas such as constraint 
programming (e.g., [17,13]) and non-monotonic reasoning (e.g., [14]). 

Most research on OP execution of Prolog has focused on techniques aimed 
at shared-memory multiprocessors (SMMs). In this paper we are concerned with 
the development of execution models for exploitation of OP from Prolog 
programs on Distributed Memory Architectures (DMPs), i.e., architectures that 
do not provide any centralized memory resource. The techniques we propose are 
immediately applicable to other systems based on the same underlying model, 
e.g., constraint programming [17] and non-monotonic reasoning [14] systems. 
Other approaches to OP on DMPs have also been proposed recently [8,18,3]. 

Experimental [1] and theoretical studies [15] have also demonstrated that 
stack-copying, and in particular incremental stack-copying, is one of the most 
effective implementation techniques for exploiting OP that one can devise. Stack- 
copying allows sharing of work between parallel agents by copying the state of one 
agent (which owns unexploited tasks) to another agent (which is currently idle). 
The idea of incremental stack-copying is to only copy the difference between 
the state of two agents. Incremental stack-copying has been used to implement 
or-parallel Prolog efficiently in a variety of systems (e.g., MUSE [1], YAP [16]), 
as well as to exploit parallelism from constraint systems [17] and non-monotonic 
reasoning systems [14]. In order to further reduce the communication during 
stack-copying and make its implementation efficient on share-nothing platforms, 
a new technique, called stack-splitting, has recently been proposed [11]. In this 
paper, we describe the first ever concrete implementation of stack-splitting on 
a DMP platform — specifically a Pentium-based Beowulf — along with a novel 
scheme to combine incremental copying with stack-splitting on DMPs. The incre- 
mental stack-splitting scheme is based on a procedure which labels choice-points 
and then compares the labels to determine the fragments of memory areas that 
need to be exchanged between agents. We also describe a scheduling scheme 
which is suitable to be used with this novel incremental stack-splitting scheme. 
Both the incremental stack-splitting and the scheduling schemes described have 
been implemented in the PALS system, a message-passing OP implementation 
of Prolog. In this paper we present performance results obtained from this im- 
plementation. To our knowledge, PALS is the first OP implementation of Prolog 
on a Beowulf architecture (built from off-the-shelf components). 



2 Stack-Splitting 

Relatively few efforts [18,9,3,8,7,6] have been devoted to implementing logic 
programming systems on DMPs. Some of the older proposals (e.g., [7,6]) re- 
lied on variations of stack-copying, while the most recent proposals (e.g., [8,18]) 
make use of alternative schemes. Out of these efforts only a small number have 
been implemented as working prototypes, and even fewer have produced accept- 
able speed-ups. Existing techniques developed for SMMs are mostly inadequate 
for the needs of DMPs. Most implementation methods require sharing of data 
and/or control stacks to work correctly. Even if the need to share data stacks is 
eliminated — as in stack-copying — the need to share the control stack still exists. 

2.1 The Need for a Different Stack-Copying Model 

Traditional stack-copying relies on idle agents copying data structures from busy 
agents in order to obtain new tasks. In traditional stack-copying, as implemented 
in MUSE, backtracking on a choice-point which has been shared between two 
or more agents, requires acquiring exclusive access to the corresponding shared 
frame. Shared frames are associated with each copied choice-point and used to 
maintain a shared representation of the alternatives available in such choice- 
point. The use of shared frames with mutually exclusive access guarantees that 
no two agents explore the same alternative. This solution works well on SMMs — 
where mutual exclusion is implemented using locks. However, on a DMP this 
process is a source of overhead, since the shared area becomes a bottleneck [4]. 

Nevertheless, stack-copying has been recognized as one of the best repre- 
sentation methodologies to support OP in a DMP setting [9, 3, 7, 6]. This is be- 
cause, while the choice-points are shared (through the shared frames), at least 
all the other data-structures, such as the environment, the trail, and the heap 
are not. Other environment representation schemes proposed for OP require 
more extensive sharing of data structures and seem less suitable to support ex- 
ecution on DMPs (although some recent efforts for adapting the binding array 
scheme to DMPs — through the use of distributed shared-memory — have been 
studied [18,8]). To avoid the problem of sharing choice-points in distributed 
implementations, many developers have reverted back to the scheduling on top- 
most choice-point strategy [3,6,9]. This methodology transfers between agents 
only the highest choice-point (i.e., closer to the root) in the computation or-tree 
which contains unexplored alternatives. The reasoning is that untried alterna- 
tives of a choice-point created higher up in the or-tree are more likely to gen- 
erate large subtrees as well as minimize the amount of computation “shared” 
by different agents. Furthermore, this is guaranteed to be the only choice-point 
with unexplored alternatives shared between agents. However, if the granularity 
of the branches in the top-most choice-points does not turn out to be large, 
then another untried alternative has to be picked and a new copying operation 
performed. In contrast, in scheduling on bottom-most choice-point more work 
can be found via backtracking, since more choice-points are copied during the 
same sharing operation. Scheduling on bottom-most choice-point is character- 
ized by the fact that all the choice-points owned by one agent are copied during 
a sharing operation. Additionally, scheduling on bottom-most is closer to the 
depth-first search strategy used by sequential systems, and facilitates support of 
Prolog semantics. Research done on comparing scheduling strategies indicates 
that scheduling on bottom-most is superior to scheduling on top-most [5]. This 
is especially true for stack-copying because: (i) the number of copying operations 
is minimized; and, (ii) the alternatives in the choice-points copied are “cheap” 
sources of additional work, available via backtracking. However, the shared na- 
ture of choice-points is a major drawback for stack-copying on DMPs. 

2.2 Stack-Splitting Copying Model 

In the stack-copying approach, the primary reason why a choice-point has to be 
shared is because we want to serialize the selection of untried alternatives, so that 
no two agents can pick the same alternative. The shared frame is locked while 
the alternative is selected to achieve this effect. However, there are other simple 
ways of ensuring the same property. One is to split the choice-points, i.e., 
to give each agent all the alternatives of alternate choice-points (see Fig. 1). In 
this case, the list of choice-points is split between the two agents. We call this 
operation choice-point stack-splitting or simply stack-splitting. 




Stack-splitting will ensure that no two agents pick the same alternative. The 
need for a shared frame, as a critical section to protect the alternatives from 
multiple executions, has disappeared, as each stack copy has a different choice- 
point. All the choice-points can be evenly split in this way during the copying 
operation. The major advantage of stack-splitting is that scheduling on bottom- 
most can still be used without incurring huge communication overheads. Es- 
sentially, after splitting, the different or-parallel threads become independent of 
each other, and hence communication is minimized during execution. This makes 
the stack-splitting technique highly suitable for DMPs. Observe that alternative 
splitting strategies may also be designed — e.g., dividing the alternatives within 
each choice-point between the two agents [11]. 
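
The alternating split described above can be sketched as follows. This is a minimal illustration and not the PALS code: the `ChoicePoint` class, the `split_stack` function, the rule that even stack positions go to the receiving agent, and the use of an empty list to stand for a choice-point whose untried alternatives were given away are all assumptions made for the example.

```python
from copy import deepcopy
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ChoicePoint:
    name: str
    alternatives: List[str] = field(default_factory=list)  # untried clauses


def split_stack(stack: List[ChoicePoint]) -> Tuple[List[ChoicePoint], List[ChoicePoint]]:
    """Copy the choice-point stack and hand alternate choice-points to each agent.

    Both agents keep every choice-point, but the untried alternatives of any
    given choice-point survive in exactly one of the two copies.
    """
    giver = deepcopy(stack)
    receiver = deepcopy(stack)
    for i in range(len(stack)):
        # Assumed convention: even positions go to the receiver,
        # odd positions stay with the agent giving work.
        if i % 2 == 0:
            giver[i].alternatives = []
        else:
            receiver[i].alternatives = []
    return giver, receiver


# Two choice-points a and b, each with one untried alternative.
giver, receiver = split_stack([ChoicePoint("a", ["a2"]), ChoicePoint("b", ["b2"])])
```

Because no alternative appears in both copies, neither agent needs a lock or a shared frame to select its next alternative.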

The shared frames in the stack-copying technique are used to maintain global 
information related to scheduling. The shared frames provide a global description 
of the or-tree, and each shared frame records which agent is working in which 
part of the tree. This last piece of information is needed to support scheduling in 
stack-copying systems — work is taken from the agent that is “closer” in the or- 
tree, thus reducing the amount of information to be copied. The shared frames 
ensure accessibility of this information to all agents, providing a consistent view 
of the computation. However, under stack-splitting the shared frames no longer 
exist; scheduling and work-load information will have to be maintained in some 
other way. They could be kept in a global shared area, as in the case of SMMs — 
e.g., by building a representation of the or-tree — or distributed over multiple 
agents and accessed by message passing in case of DMPs. Shared frames are 
also employed in MUSE [1] to detect the Prolog order of choice-points, needed 
to execute order-sensitive predicates (e.g., side-effects) in the correct order. As 
in the case of scheduling, some information regarding global ordering of 
choice-points needs to be maintained to execute order-sensitive predicates in the 
correct order. In this paper, however, we do not handle side-effects and 
order-sensitive predicates. Thus, stack-splitting does not completely remove the 
need for a shared description of the or-tree. On the other hand, the use of 
stack-splitting mitigates the impact of accessing shared resources: for example, 
stack-splitting allows scheduling on bottom-most, which reduces the number of 
calls to the scheduler. 

Stack-splitting has the potential to improve locality of computation, reduce 
communication between agents, and improve cache behavior. Indeed, the SMM 
implementation of stack-splitting described in [11] achieves on many benchmarks 
better speedups than traditional stack copying. The ability to reuse the same 
technology on both SMMs and DMPs is also a key to development of Prolog 
systems on Clusters of SMMs, i.e., distributed systems with SMMs as nodes. 



2.3 Incremental Stack-Copying 

Traditional stack-copying requires agents which share work to transfer a com- 
plete copy of the data structures representing the status of the computation. 
In the case of a Prolog computation, this may include transferring most of the 
choice-points along with copies of the other data areas (trail, heap, environ- 
ments). Since Prolog computations can make use of large amounts of memory, 
this copying operation can become quite expensive. Existing stack-copying 
systems (e.g., MUSE) have introduced a variation of stack-copying, called 
Incremental Stack-Copying [1], which considerably reduces the amount of 
data transferred during a sharing operation. The idea is to transfer only the 
difference between the data areas of the two agents. Incremental stack-copying, 
in a shared-memory context, is relatively simple to realize — the shared frames 
can be used to identify which choice-points are in common and which are not [1]. 

In the rest of the paper we describe a complete implementation of stack- 
splitting on a DMP platform, analyzing in detail how the various problems men- 
tioned earlier have been tackled. In addition to the basic stack-splitting scheme, 
we analyze how stack-splitting can be extended to incorporate incremental copy- 
ing, an optimization which has been deemed essential to achieve speed-ups in 
various classes of benchmarks. The solution we describe has been developed in a 
concrete implementation, realized by modifying the engine of a commercial Pro- 
log system (ALS Prolog) and making use of MPI as communication platform. 
The ALS Prolog engine is based on the Warren Abstract Machine (WAM). 

3 Incremental Stack-Splitting 

During stack-splitting, all WAM data areas, except for the code area, are copied 
from the agent giving work to the idle one. Next, the parallel choice-points 
are split between the two agents. Blindly copying all the stacks every time an 
agent shares work with an idle agent can be wasteful, since frequently 
the two agents already have parts of the stacks in common due to previous 
copying. We can take advantage of this fact to reduce the amount of copying 
by performing incremental copying. To determine the part that actually needs 
to be copied during incremental stack-splitting, parallel choice-points are 
labeled. The goal of the labeling process is to uniquely identify 
the original “source” of each choice-point (i.e., which agent created it), to allow 
unambiguous detection of copies of common choice-points. 





To perform labeling, each agent maintains a counter. The counter is increased 
by 1 every time the labeling procedure is performed. When a parallel choice-point 
is copied for the first time, a label for it is created. The label is composed of three 
parts: (1) agent rank, (2) counter, and (3) choice-point address. The agent rank 
is the rank (i.e., id) of the agent which created the choice-point. The counter is 
the current value of the labeling counter for the agent generating the labels. The 
choice-point address is the address of the choice-point which is being labeled. 
The labels for the parallel choice-points are recorded in a separate label stack, in 
the order they are created. Also, when a parallel choice-point is removed from the 
stack, its corresponding label is also removed from the label stack (this is actually 
integrated with the variable untrailing mechanism). Initially, the label stack in 
each agent is set to empty. Intuitively, the label stack keeps a record of changes 
done to the stacks since the last stack-splitting operation. Let us illustrate the 
stack-splitting accompanied by labeling with an example. Suppose process A 
has just created two parallel choice-points and process B is idle. Process A first 
creates labels for its two parallel choice-points. These labels have their rank and 
counter parts as A:1. Process A pushes these labels onto its label stack (Fig. 2). 

Process B gets all the parallel choice-points of process A along with process A's 
label stack. Then, stack-splitting takes place: process A will keep the alternative 
b2 but not a2, and process B will keep the alternative a2 but not b2. We have 
designed a new WAM scheduling instruction which is placed in the next alter- 
native field of the choice-point above which there is no more parallel work. This 
scheduling instruction implements the scheduling scheme described in Section 
4. To avoid taking the original alternative of a choice-point, we change its next 
alternative field to WAM instruction trust-fail. See Fig. 3. Afterwards, process 
B backtracks, removes choice-point b along with its corresponding label in the 
label stack, and then takes alternative a2 of choice-point a. 
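
The labeling scheme can be sketched as follows. The three label fields are those given in the text; the `Agent` class, its method names, the counter starting at 1 (so that the first labels read rank:1, as in the example), and the representation of choice-points by bare addresses are assumptions made for illustration only.

```python
from typing import List, NamedTuple


class Label(NamedTuple):
    rank: int      # id of the agent that created the choice-point
    counter: int   # value of that agent's labeling counter
    address: int   # address of the labeled choice-point


class Agent:
    def __init__(self, rank: int) -> None:
        self.rank = rank
        self.counter = 1                    # assumed initial value
        self.label_stack: List[Label] = []  # initially empty, oldest label first
        self.parallel_cps: List[int] = []   # choice-point addresses, oldest first

    def label_choice_points(self) -> None:
        """Label every parallel choice-point that has no label yet."""
        labeled = {lab.address for lab in self.label_stack}
        for addr in self.parallel_cps:
            if addr not in labeled:
                self.label_stack.append(Label(self.rank, self.counter, addr))
        self.counter += 1  # one increment per run of the labeling procedure

    def pop_choice_point(self) -> None:
        """Remove the youngest choice-point and, if present, its label."""
        addr = self.parallel_cps.pop()
        if self.label_stack and self.label_stack[-1].address == addr:
            self.label_stack.pop()
```

All labels created in one run of the procedure share the same rank and counter and differ only in the address, which is what allows two agents to recognize copies of the same choice-point later.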



3.1 Incremental Stack-Splitting: The Procedure 

Assume process W is giving work to process I. Process W will label all its parallel 
choice-points which have not been labeled before and will push the new labels onto 
its label stack. If process I's label stack is empty, then non-incremental 
stack-copying will need to be performed followed by stack-splitting: process W 
sends its complete choice-point stack and its complete label stack to process I, 
and stack-splitting is then performed on all the parallel choice-points of process 
W. However, if process I's label stack is not empty, then process I sends its label 
stack to process W. Process W compares its label stack against the label stack 
from I. The objective is to find the last choice-point ch with a common label. In 
this way, processes W and I are guaranteed to have the same computation above the 
choice-point ch, while their computations will be different below such choice-point. 




Fig. 4. Labels Comparison 



Fig. 5. Proc. A Gave Work to Proc. C 



If the choice-point ch does not exist, then non-incremental stack-copying will 
need to be performed followed by stack-splitting, as described before. However, 
if choice-point ch does exist, then process I backtracks to choice-point ch, and 
performs incremental-copying. Process W sends its choice-point stack starting 
from choice-point ch to the top of its choice-point stack. Process W also sends 
its label stack starting from the label corresponding to ch to the top of its label 
stack. Stack-splitting is then performed on all the parallel choice-points of W. 
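
The comparison step of the procedure can be sketched as follows. This is a simplified illustration under stated assumptions, not the PALS code: labels are plain tuples, the two label stacks are compared from the bottom until they first differ, and the choice-point stack is assumed to contain only parallel choice-points, one per label. The names `last_common_index` and `plan_transfer` are hypothetical.

```python
from typing import List, Optional, Tuple

Label = Tuple[int, int, int]  # (agent rank, counter, choice-point address)


def last_common_index(giver: List[Label], idle: List[Label]) -> Optional[int]:
    """Index of the last label shared by both stacks, or None if there is none."""
    common = None
    for i, (g, r) in enumerate(zip(giver, idle)):
        if g != r:
            break
        common = i
    return common


def plan_transfer(giver_cps: List[str], giver_labels: List[Label],
                  idle_labels: List[Label]):
    """Decide what process W sends to process I.

    Returns (mode, choice-points to send, labels to send).
    """
    ch = last_common_index(giver_labels, idle_labels) if idle_labels else None
    if ch is None:
        # Empty label stack on I, or no common label: copy everything.
        return "full", list(giver_cps), list(giver_labels)
    # Incremental case: send from choice-point ch up to the top of the stack.
    return "incremental", giver_cps[ch:], giver_labels[ch:]
```

In the incremental case process I would first backtrack to choice-point ch; in both cases stack-splitting follows the transfer.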

We illustrate the above procedure by the following example. Suppose process 
A has three parallel choice-points and process C requests work from A. Process A 
first labels its last two parallel choice-points which have not been labeled before 
and then increments its counter. Afterwards, process C sends its label stack to 
process A. Process A compares its label stack against the label stack of process 
C and finds the last choice-point ch with a common label. Above choice-point 
ch, the Prolog trees of processes A and C are equal. See Fig. 4. Now, process 
C backtracks to choice-point ch. Incremental stack-copying can then take place. 
Process A sends its choice-point stack starting from choice-point ch to the top 
of its choice-point stack, and stack-splitting is performed (Fig. 5). 



3.2 Incremental Stack-Splitting: Challenges 

Sequential Choice-points: The first issue has to do with sequential choice- 
points that are located among the parallel choice-points shared by two agents. If 
the alternatives of these choice-points are kept in both processes, we may have 
repeated or wrong computations. Hence, the alternatives of these choice-points 
should only be kept in one process (e.g., the one giving work). If the alternatives 
are kept in the process giving work, then the process that is receiving work 
should change the next alternative field of these choice-points to the instruction 
trust-fail to avoid taking the original alternatives of these choice-points. 
Installation Process: The second issue has to do with the bindings of con- 
ditional variables (i.e., variables that may be bound differently in different or- 
parallel branches) which may not be copied during the incremental splitting 
process. This can be fixed by having the process giving work create a stack of 
all these conditional variables along with their bindings. This stack will then be 
sent to the process receiving work so that it can update the bindings. 

Garbage Collection: When garbage collection takes place, relocation of choice- 
points may also occur. Hence, the labels in our label stack may no longer label 
the correct parallel choice-points. Therefore, we need to modify our labeling 
procedure so that when garbage collection on an agent takes place, the label 
stack of this agent is invalidated. The next time this process gives work, non- 
incremental stack-copying will have to take place. This solution is analogous to 
the one adopted in the original implementation of the MUSE system [1]. 





Fig. 6. Copy Nextclause from first cp to ch 



Fig. 7. C Received Next-Clause Fields 
Next Clause Fields: The fourth issue arises when the next clause fields of 
the parallel choice-points between the first parallel choice-point first cp and the 
last choice-point ch with a common label in the agent giving work are not the 
same compared to the ones in the agent receiving work. This situation occurs 
after several copying and splitting operations. In this case, we cannot just copy 
the part of the choice-point stack between choice-point ch and the top of the 
stack and then perform the splitting. This is because the splitting will not be 
performed correctly. For example, suppose that in our previous example when 
process C requests work from process A, we have this situation (Fig. 6). We can 
see that choice-point g should be given to process C. But process C does not 
have the right next clause field for this choice-point. The problem can be solved 
by having the process giving work send all the next clause fields between its first 
choice-point first cp and choice-point ch to the process receiving work. Then the 
splitting of all parallel choice-points can take place correctly. See Fig. 7. 

4 Scheduling 

The main objective of a scheduling strategy is to balance the amount of parallel 
work done by different agents. Additionally, work distribution among agents 
should be done with minimal communication overhead. These two goals are 
somewhat at odds with each other, since achieving perfect balance may result in 
a very complex scheduling strategy with considerable communication overhead, 
while a simple scheduling strategy which re-distributes work less often will incur 
low communication overhead but poor balancing of work. Distributing parallel 
work as evenly as possible therefore conflicts with minimizing the distribution 
overhead, and our main goal is to find a trade-off point that results in a 
reasonable scheduling strategy. 

We adopt a simple distributed algorithm to implement a scheduling strategy 
in PALS. A data structure — the load vector — is introduced to indicate the work 
loads of different agents. The work load of an agent is approximated by the 
number of parallel choice-points present in its local computation tree. Each agent 
keeps a work load vector V in its local memory, and the value of V[i] represents 
the work load of the agent with rank i. Based on the work load vector, an idle 
agent can request parallel work from the agent with the greatest work load, 
so that parallel work can be fairly distributed. The load vector is updated at 
runtime. When stack-splitting is performed, a Load_Info message with updated 
load information is broadcast to all the agents so that each agent has 
the latest information on the work load distribution. Additionally, load 
information is attached to each message. For example, when a Request_Work 
message is received from agent Pi, the value of Pi’s work load, 0, can be inferred. 
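
A minimal sketch of how an idle agent might use its local load vector is shown below. The function names and the choice of the lowest rank when several agents tie are assumptions of the example; the text only states that the agent with the greatest work load is selected.

```python
from typing import List, Optional


def pick_target(load: List[int], my_rank: int) -> Optional[int]:
    """Rank of the most loaded other agent, or None if no other agent has work."""
    best = None
    for rank, work in enumerate(load):
        if rank == my_rank or work <= 0:
            continue
        if best is None or work > load[best]:
            best = rank
    return best


def on_load_info(load: List[int], rank: int, work: int) -> None:
    """Record the load reported by, or attached to a message from, agent `rank`."""
    load[rank] = work


V = [0, 3, 7, 2]            # local view held by the idle agent with rank 0
target = pick_target(V, 0)  # agent 2 currently has the greatest load
```

Since each agent only holds an approximate view, the selected agent may have run out of work by the time the request arrives; the request is then answered without work and the local vector is corrected from the reply.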

Based on its work load, each agent can be in one of two states: scheduling 
state or running state. An agent that is running occasionally checks whether 
there are incoming messages. Two possible types of messages are checked by the 
running agent: one is the Request_Work message sent by an idle agent, and the other 
is the Send_Load_Info message, which is sent when stack-splitting occurs. The idle 
agent in scheduling state is also called a scheduling agent. 
The distributed scheduling algorithm mainly consists of two parts: one is 
for the scheduling agent, and the other is for the running agent. An idle agent 
wants to get work as soon as possible from another agent, preferably the one 
that has the largest amount of work. The scheduling agent searches through 
its local load vector for the agent with the greatest work load, and then sends 
a Request_Work message to that agent asking for work. If all the other agents 
have no work, then the execution of the current query is finished and the agent 
halts. When a running agent receives a Request_Work message, stack-splitting 
will be performed if the running agent’s work load is greater than the splitting 
threshold, otherwise, a Reply_Without_Work message with a positive work load 
value will be sent as a reply. If a scheduling agent receives a Request_Work 
message, a Reply_Without_Work message with work load 0 will be sent as a 
reply. The running agent’s algorithm can be briefly described as follows: each 
incoming message can be either a Send_LoadInfo message (i.e., a notification 
of a change in load for some processors) or a Request_Work message (i.e., a 
request for sharing, which is accepted if the local load is above a given threshold). 
At fixed time intervals (which can be selected at initialization of the system) the 
agent examines the content of its message queue for any pending messages. 
Send_LoadInfo messages are quickly processed to update the local view of the 
overall load in the system. Messages of the type Request_Work are handled as 
described above. Observe that the concrete implementation actually checks for 
the presence of the two types of messages with different frequency (i.e., request 
for work messages are considered less frequently than requests for load update). 
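
The message handling just described can be sketched as follows. This is an illustration under stated assumptions rather than the PALS scheduler: messages are modeled as tuples in a list instead of MPI messages, the reply kind `Share_Work` is a hypothetical stand-in for the actual stack-splitting operation, and forcing the load reported by a busy agent to be at least 1 is our reading of the "positive work load value" mentioned in the text.

```python
from typing import List, Tuple

Message = Tuple[str, int, int]  # (kind, sender rank, sender's work load)
Reply = Tuple[int, str, int]    # (destination rank, kind, reported work load)


def handle_messages(queue: List[Message], load: List[int], my_rank: int,
                    state: str, threshold: int) -> List[Reply]:
    """Process all pending messages and return the replies to be sent."""
    replies: List[Reply] = []
    for kind, sender, sender_load in queue:
        # Load information travels with every message, so the local view
        # is refreshed whatever the message kind is.
        load[sender] = sender_load
        if kind != "Request_Work":
            continue  # a load notification needs no reply
        if state == "running" and load[my_rank] > threshold:
            # Hypothetical marker: this is where stack-splitting would start.
            replies.append((sender, "Share_Work", load[my_rank]))
        elif state == "running":
            # Busy but below the splitting threshold: report a positive load.
            replies.append((sender, "Reply_Without_Work", max(1, load[my_rank])))
        else:
            # A scheduling (idle) agent has nothing to give.
            replies.append((sender, "Reply_Without_Work", 0))
    queue.clear()
    return replies
```

A real agent would call such a routine at the fixed polling interval and could poll the two message kinds at different frequencies, as noted above.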

5 Implementation and Performance 

Stack-Splitting: The stack-splitting procedure has been implemented by 
modifying the commercial ALS Prolog system, using the MPI library for message 
passing. The only major data structures added to the ALS system are the 
label stack, the load vector, and the buffers used to transfer information. The whole 
system runs on a truly distributed machine (a network of 32 Pentium II nodes 
connected by Myrinet-SAN Switches). All communication — during scheduling, 
copying, splitting, etc. — is done using explicit message passing via MPI. 

The benchmarks used to test our system are standard benchmarks drawn 
from the pool of programs frequently used to evaluate OP systems (e.g., Queens, 
Knight, Solitaire). The benchmarks selected are simple but provide sufficiently 
different program structures to validate the parallel engine. The timing results 
in seconds from our incremental stack-splitting system are presented in Table 1. 
The modifications made to the ALS WAM are very localized and reduced to the 
minimum. This has allowed us to keep a clean design — that can be easily ported 
to other WAM-based implementations — and to contain the parallel overhead — 
our engine on a single processor is on average 5% slower than ALS WAM. The 
corresponding speed-ups are presented in Fig. 8 (with label incremental). 

Table 1. Timings for Incremental Stack-Splitting (Time in sec.) 

Benchmark       Processors 
                1          2          4          8          16         32 
Knight          159.950    81.615     40.929     20.754     10.939     8.248 
Send More       61.817     32.953     17.317     8.931      4.923      3.916 
8 Puzzle        27.810     15.387     8.442      10.522     3.128      5.940 
Solitaire       5.909      3.538      1.811      1.003      0.628      0.535 
10 Queens       4.572      2.418      1.380      0.821      1.043      0.905 
Hamilton        3.175      1.807      0.952      0.610      0.458      0.486 
Map Coloring    1.113      0.702      0.430      0.319      0.318      0.348 
8 Queens        0.185      0.162      0.166      0.208      0.169      0.180 

Note that for benchmarks with substantial running time the speed-ups are 
quite good, while for programs with smaller running time the speed-ups 
deteriorate. This is consistent with our belief that DMP implementations should be 
used for parallelizing programs with coarse-grained parallelism. For programs 
with small running times, there is not enough work to offset the communication 
costs on DMPs. Nevertheless, our system is reasonably efficient, given that even 
for small benchmarks it can produce speed-ups. It is also interesting to observe 
that in no case have we observed slow-downs due to parallel execution, thanks 
to simple granularity control mechanisms embedded in the scheduler. For some 
benchmarks the speed-up graphs are somewhat irregular, especially for the 8 Puzzle. 
We believe that the reason for this lies in the scheduling strategy used. 
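
As a worked illustration of how speed-ups follow from the timings, the sketch below recomputes them for two rows of Table 1. It assumes that the speed-up on p processors is the 1-processor time of the same row divided by the p-processor time; the function names are ours.

```python
# Timings in seconds copied from Table 1 (two rows only).
timings = {
    "Knight":    {1: 159.950, 2: 81.615, 4: 40.929, 8: 20.754, 16: 10.939, 32: 8.248},
    "10 Queens": {1: 4.572, 2: 2.418, 4: 1.380, 8: 0.821, 16: 1.043, 32: 0.905},
}


def speedups(row):
    """Speed-up on p processors, relative to the 1-processor time of the same row."""
    return {p: row[1] / t for p, t in row.items()}


def efficiency(row):
    """Speed-up divided by the number of processors used."""
    return {p: s / p for p, s in speedups(row).items()}


for name, row in timings.items():
    print(name, {p: round(s, 2) for p, s in speedups(row).items()})
```

On 32 processors this gives a speed-up of about 19.4 for Knight and about 5 for 10 Queens, which matches the observation that longer-running benchmarks benefit more.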

One of the objectives of the experiments performed is to validate the effec- 
tiveness of incremental stack-splitting for efficient exploitation of parallelism on 
DMPs. In particular, there are two aspects that we were interested in exploring: 
(i) verifying the effectiveness of stack-splitting versus a more “direct” imple- 
mentation of stack-copying (i.e., keeping single copies of choice-points around 
the system); (ii) verifying the impact of incremental splitting. Validity of stack- 
splitting vs. stack-copying can be inferred from the experiments described in the 
next subsection: a direct implementation of stack-copying would produce the 
same amount of communication traffic as some of the variations of scheduling 
tested, and thus incur the same kind of problems described next. In order to 
evaluate the impact of incrementality, we have measured the performance of the 
system on the selected benchmarks without the use of incremental splitting — 
i.e., each time a sharing operation takes place, a complete copy of the data areas 
is performed. The results obtained from this experiment are in Fig. 8: the figure 
compares the speed-ups observed with and without incremental copying. We can 
observe that incremental stack-splitting obtains higher speed-ups than the non- 
incremental stack-copying. The difference is more significant in benchmarks with 
a large number of choice-points, where incrementality is applied more frequently. 
Scheduling: One of the major reasons to adopt stack-splitting is the ability 
to perform scheduling on bottom-most choice-point. Other DMP implementa- 
tions of OP have resorted to scheduling on the top-most choice-point, where only 
the oldest choice-point with unexplored alternatives is exchanged between 
processors. Top-most scheduling will share only one choice-point at a time, thus 
relieving the engine from the need of controlling access to shared choice-points. 

Fig. 9. Incremental Stack-Splitting vs. Top Most Scheduling 

To validate the effectiveness of our claim, we have developed a top-most 
scheduler for our system and compared its performance with that of the 
incremental stack-splitting with bottom-most scheduling. Fig. 9 compares the 
speed-ups observed using the two different schedulers. In the figure we have reported 
the behavior only of those benchmarks where significant differences in 
performance have been recorded. In all other benchmarks, top-most and bottom-most 
scheduling provide similar results, as a small number of choice-points are 
created and only one at a time is shared between processors. As we can observe 
from Fig. 9, bottom-most scheduling provides a sustained speed-up considerably 
higher than top-most scheduling. This is due to the reduced number of calls to 
the scheduler performed during the execution — processors spend a higher frac- 
tion of their time doing useful work compared to scheduling on top-most. 

Another aspect of our implementation that we are interested in validating 
is the performance of the distributed scheduler. As mentioned in Sect. 4, our 
scheduler is based on keeping in each processor an “approximated” view of the 
load in each other processor. The risk that this method may encounter is that 
a processor may have out-of-date information concerning the load in other pro- 
cessors, and as a consequence it may try to request work from idle processors or 
ignore processors that may have unexplored alternatives. Fig. 10 provides some 
information concerning the number of attempts that a processor needs to per- 
form before receiving work. The figure on the left measures the average number 
of requests that a processor has to send; as we can see, the number is very small 
(1 or 2 requests are typically sufficient) and this number is generally better
if we adopt bottom-most scheduling. The figure on the right shows the maxi- 
mum number of requests observed; these numbers tend to grow towards the end 
of the computation (when less work is available) — nevertheless, typically only 
one or two processors achieve these maximum values, while the majority of the 
processors remain close to the average number of attempts. 

To further validate our scheduling approach, we have compared it with an 
alternative scheduling scheme developed in PALS. This alternative scheme is 
an implementation of a centralized scheduling algorithm, designed following the 
guidelines of the scheduler used in Opera [7]. In the centralized approach, only 
one processor, called central, is in charge of keeping track of the load information.
Idle processors send their requests for work directly to the central processor. In 
turn, the central processor is in charge of implementing a matchmaking algo- 
rithm between idle and busy processors. When stack-splitting occurs, only the 
central processor is informed about the load information update. Fig. 11 com- 
pares the speed-ups achieved using centralized scheduling with the speed-ups 
observed using the distributed scheduling approach.^ As evident from the figure, 
the speed-ups observed in centralized scheduling are almost negligible — this is 
due to the inability of the scheduling method to promptly respond to the re- 
quests for new work. Also, the use of a reasonably fast network (Myrinet) leads 
to the creation of a severe bottleneck at the level of the centralized scheduler. 



[Figure: two panels, each titled "Number of Requests Before Getting Work"]

Fig. 10. Average and Maximum Number of Tries to Acquire Work

^ We had to limit the experiments to a smaller number of CPUs due to unavailability
of half of the machine at that time.

Fig. 11. Incremental Stack-Splitting vs. Centralized Scheduling



The results presented in [5] suggest that random selection of work may also
provide a simple and effective alternative when searching for work. We have
experimented with this idea, by modifying the scheduler to select any busy
processor for scheduling. The idea is to avoid bottleneck situations where multiple
idle processors are concentrating their requests for work towards the same busy
processor. We have named this new version of the scheduler Random Scheduler.
In this version, an idle processor searches its load vector for the next processor
with load greater than a given small threshold. Fig. 12 compares the speed-ups
observed in the Random scheduler with those from the standard bottom-most
scheduling with selection of processor with highest load. The results indicate that
the Random scheduler is less effective. This suggests that selecting work from
the processor with highest load is not a severe bottleneck and sending requests
to lightly loaded processors may increase the number of calls to the scheduler.
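
The two selection policies compared here can be sketched as follows. This is a minimal illustration in Python, not the PALS code: the list-based load vector, the function names, the threshold value and the choice to start scanning just after the idle processor's own slot are all assumptions made for the example.

```python
def pick_highest_load(load, me):
    """Standard policy: ask the processor that appears to be most loaded."""
    candidates = [p for p in range(len(load)) if p != me and load[p] > 0]
    if not candidates:
        return None
    return max(candidates, key=lambda p: load[p])


def pick_next_above_threshold(load, me, threshold=1):
    """'Random' policy: take the next processor in the load vector whose
    load exceeds a small threshold (scan order is an assumption)."""
    n = len(load)
    for step in range(1, n):
        p = (me + step) % n
        if load[p] > threshold:
            return p
    return None


# Approximate, possibly out-of-date view of the load kept by processor 0.
load_vector = [0, 2, 9, 3]
```

With this load vector, the first policy sends the request to processor 2, while the second settles for processor 1, which holds little work and is therefore more likely to trigger another scheduling round soon afterwards.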

Fig. 12. Incremental Stack-Splitting vs. Random Scheduling

6 Related Work and Conclusions

In this paper we proposed a novel scheme to implement incremental stack-splitting
for OP on DMPs. The novel method makes it possible to take advantage of
the higher locality and independence of computation threads allowed by
stack-splitting, without losing the advantages of incremental copying. The incremental
stack-splitting scheme presented is based on a procedure which labels parallel 
choice-points and then compares the labels to determine the incremental WAM 
areas to be copied. Furthermore, we described a scheduling strategy for incre- 
mental stack-splitting. The incremental stack-splitting scheme and the schedul- 
ing strategy have been implemented in the ALS Prolog system, and performance 
results from this implementation were reported. To our knowledge, PALS is the 
first ever or-parallel implementation of Prolog on Beowulf systems. 

A relatively small number of proposals can be found in the literature dealing 
with execution of Prolog on DMPs. Some of the existing environment representa- 
tion models proposed (e.g., Conery’s Closed Environments) have been designed 
with distributed memory in mind, but they have never been concretized in ac- 
tual implementations. Most of the older systems implemented on DMPs [7,6] 
are based on stack copying and have been designed with respect to a specialized 
architecture (Transputers). Their schedulers are tailored for this class of archi- 
tectures and they all resort to top scheduling to reduce communication costs. 
PDP [3] makes use of a recomputation approach to deal with OP, and has also
been developed on Transputers. The MUSE version for switch-based multiprocessors
[2] (e.g., Butterfly) gives good speedups for very coarse grain applications but
uses distributed shared-memory techniques. Only in recent years has a renewed
effort towards developing models for generic DMP architectures emerged.
These include DAOS [8] and Dorpp [18] based on variations of the binding arrays 
method and relying on distributed shared-memory technology; DAOS has not 
reported any implementation result, while Dorpp has been executed on simu- 
lators (with fairly good results). In contrast to DAOS and Dorpp, we opted to 
continue using stack copying with a fully distributed scheduler. For comparison 
of stack-splitting with other existing approaches see [11].



Abstract. Tabling is an implementation technique that improves the
declarativeness and expressiveness of Prolog by reusing solutions to goals. 
Quite a few interesting applications of tabling have been developed in the 
last few years, and several are by nature non-deterministic. This raises 
the question of whether parallel search techniques can be used to improve 
the performance of tabled applications. 

In this work we demonstrate that the mechanisms proposed to parallelize 
search in the context of SLD resolution naturally generalize to parallel 
tabled computations, and that resulting systems can achieve good per- 
formance on multi-processors. To do so, we present the OPTYap par- 
allel engine. In our system individual SLG engines communicate data 
through stack copying. Completion is detected through a novel parallel 
completion algorithm that builds upon the data structures proposed for 
or-parallelism. Scheduling is simplified by building on previous research 
on or-parallelism. We show initial performance results for our implemen- 
tation. Our best result is for an actual application, model checking, where 
we obtain linear speedups. 

Keywords: Parallel Logic Programming, Or-Parallelism, Tabling. 



1 Introduction 

The past years have seen wide effort at increasing Prolog’s declarativeness and 
expressiveness. Tabling or memoing is one such proposal that has been gaining 
in popularity. In a nutshell, tabling consists of storing intermediate answers for 
subgoals so that they can be reused when a repeated subgoal appears. Work on 
SLG resolution [3], as implemented in the XSB System [15], proved the viability 
of tabling technology for application areas such as natural language processing, 
knowledge based systems, model checking, or program analysis. Tabling based 
models are able to reduce the search space, avoid looping, and always terminate 
for programs with the bounded term-size property [4].

Tabling works for both deterministic and non-deterministic applications, but
it has frequently been used to reduce search space. This raises the question of
whether further efficiency improvements may be achievable through parallelism.
Freire and colleagues [7] were the first to propose that tabled goals could
indeed be a source of implicit parallelism. In their model, each tabled subgoal is
computed independently in a separate computational thread, a generator thread.
Each generator thread is solely responsible for fully exploiting its subgoal and
obtaining the complete set of answers. This model restricts parallelism to
concurrent execution of generator threads. Parallelism arising from non-tabled subgoals
or from alternative clauses is not exploited.

Our suggestion is that we should exploit parallelism from both tabled and 
non-tabled subgoals. By doing so we can both extract more parallelism, and 
reuse the mature technology for tabling and parallelism. Towards this goal, 
we previously proposed two computational models to combine tabling with
or-parallelism [12]: the Or-Parallelism within Tabling (OPT) and the Tabling within
Or-Parallelism (TOP) models.

This paper presents an implementation for the OPT model, the OPTYap 
system. To the best of our knowledge, OPTYap is the first available system 
that can exploit parallelism from tabled programs. The OPT model considers 
tabling as the base component of the system. Each computational worker behaves 
as a full sequential tabling engine. The or-parallel component of the system is 
triggered to allow synchronized access to the shared part of the search space or 
to schedule work. 

From the beginning, we aimed at developing an or-parallel tabling system
that, when executed with a single worker, runs as fast as or faster than current
sequential tabling systems, as otherwise parallel performance would be neither
significant nor fair. To achieve these goals, OPTYap builds on the YapOr [13] and
YapTab [14] engines. YapOr is an or-parallel engine that extends Yap’s efficient 
sequential engine [16]. It is based on the environment copy model, as first imple- 
mented in Muse [1]. YapTab is a sequential tabling engine that extends Yap’s ex- 
ecution model to support tabled evaluation. YapTab’s implementation is largely 
based on the ground-breaking SLG-WAM work used in the XSB system [15]. 

The remainder of the paper is organized as follows. First, we briefly introduce 
the basic tabling definitions and the SLG-WAM. Next, we present the OPT com- 
putational model and discuss its implementation framework. We then present 
the new data areas, data structures and algorithms to extend the Yap Prolog 
system to support sequential and parallel tabling. Last, we present some early 
performance data and terminate by outlining some conclusions and further work. 



2 Tabling and the SLG-WAM 

Tabling is about storing and reusing intermediate answers for goals. In variant- 
based tabling, whenever a tabled subgoal S is called for the first time, an entry for 
S is allocated in the table space. This entry will collect all the answers found for 
S. Repeated calls to variants of S are resolved by consuming the answers already
stored in the table. Meanwhile, as new answers are generated, they are inserted
into the table and returned to all variant subgoals. Within this model, the nodes
in the search space are classified as generator nodes, corresponding to first
calls to tabled subgoals; consumer nodes, corresponding to variant calls to tabled
subgoals; or interior nodes, corresponding to non-tabled predicates.

Tabling based evaluation has four main types of operations for definite
programs. The Tabled Subgoal Call operation checks if the subgoal is in the table
and, if not, inserts it and allocates a new generator node. Otherwise, it allocates
a consumer node and starts consuming the available answers. The New Answer
operation verifies whether a newly generated answer is already in the table, and 
if not, inserts it. The Answer Resolution operation consumes the next newly 
found answer, if any. The Completion operation determines whether a tabled 
subgoal is completely evaluated, and if not, schedules a possible resolution to 
continue the execution. 
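
The four operations can be illustrated with a deliberately small evaluator. The sketch below is written in Python rather than Prolog and is not the SLG-WAM: it handles a single tabled subgoal for the left-recursive program shown in the docstring, with no suspension mechanism, and all names (tabled_path, new_answer, edges) are assumptions made for the example.

```python
def tabled_path(edges, start):
    """Answers for the tabled subgoal path(start, Y) under the program
         path(X, Y) :- path(X, Z), edge(Z, Y).
         path(X, Y) :- edge(X, Y).
    """
    table = {}                        # table space: subgoal -> answers

    # Tabled Subgoal Call: first call, so the subgoal is inserted in the
    # table and this call plays the role of the generator.
    answers = table.setdefault(("path", start), [])

    def new_answer(y):
        # New Answer: insert only if the answer is not yet in the table.
        if y not in answers:
            answers.append(y)

    # Second clause: path(X, Y) :- edge(X, Y).
    for x, y in edges:
        if x == start:
            new_answer(y)

    # First clause: the recursive call is a variant of the subgoal, so it
    # behaves as a consumer. Answer Resolution hands it one answer at a time.
    consumed = 0
    while consumed < len(answers):
        z = answers[consumed]
        consumed += 1
        for x, y in edges:
            if x == z:
                new_answer(y)

    # Completion: every answer has been consumed and none can be added.
    return answers


cyclic_edges = [("a", "b"), ("b", "c"), ("c", "a")]
```

On the cyclic graph above, plain SLD resolution would loop on the left-recursive clause, whereas the table-driven loop stops once the consumer has seen every stored answer.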

Space for a subgoal can be reclaimed when the subgoal has been completely 
evaluated. A subgoal is said to be completely evaluated when all its possible 
resolutions have been performed, that is, when no more answers can be generated 
and the variant subgoals have consumed all the available answers. Note that a 
number of subgoals may be mutually dependent, forming a strongly connected 
component (or SCC) [15], and therefore can only be completed together. The 
completion operation is thus performed at the leader of the SCC, that is, by the 
oldest subgoal in the SCC, when all possible resolutions have been made for all 
subgoals in the SCC. Hence, in order to efficiently evaluate programs one needs 
an efficient and dynamic detection scheme to determine when all the subgoals 
in a SCC have been completely evaluated. 

The implementation of tabling in XSB Prolog was attained by extending the 
WAM [17] into the SLG-WAM [15]. In short, the SLG-WAM introduces a new 
set of instructions to deal with the operations above, a special mechanism to 
allow suspension and resumption of computations, and two new memory areas: 
a table space, used to save the answers for tabled subgoals; and a completion 
stack, used to detect when a set of subgoals is completely evaluated. The SLG- 
WAM also introduced the concepts of freeze registers and forward trail to handle 
suspension [15]. 

3 Or-Parallelism within Tabling 

The OPT model [12] divides the search tree into a public and several private 
regions, one per worker. Workers in their private region execute nearly as in 
sequential tabling. Workers exploiting the public region of the search tree must 
be able to synchronize in order to ensure the correctness of the tabling
operations. When a worker runs out of alternatives to exploit, it enters scheduling
mode. The YapOr scheduler is used to search for busy workers with unexploited
work. Alternatives should be made available for parallel execution, regardless of 
whether they originate from generator, consumer or interior nodes.

Parallel execution requires significant changes to the SLG-WAM.
Synchronization is required (i) when backtracking to public generator or interior nodes
to take the next available alternative; (ii) when backtracking to public consumer 
nodes to take the next unconsumed answer; or, (iii) when inserting new answers 
into the table space. In a parallel tabling system, the relative positions of gen- 
erator and consumer nodes are not as clear as for sequential systems. Hence 
we need novel algorithms to determine whether a node is a leader node and to 
determine whether a SCC can be completed. 

OPTYap uses environment copying for or-parallelism and the SLG-WAM for 
tabling because these are, respectively, two of the most successful or-parallel and 
tabling engines. Copying is a popular and effective approach to or-parallelism
that minimizes actual changes to the WAM. To share work we use incremental 
copying [1], that is, we only copy differences between stacks. 
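
The idea of copying only the differences between stacks can be sketched as follows. This is an illustration under simplifying assumptions, not the YapOr code: each stack is modelled as a Python list, and the shared part is found by comparing cells, whereas a real engine would locate it from the youngest choice point common to both workers. The function name is hypothetical.

```python
def incremental_copy(source_stack, target_stack):
    """Make target_stack equal to source_stack, copying only what differs.

    Returns the number of cells that actually had to be copied.
    """
    # Length of the prefix that both workers already share.
    common = 0
    limit = min(len(source_stack), len(target_stack))
    while common < limit and source_stack[common] == target_stack[common]:
        common += 1

    del target_stack[common:]                    # discard the diverging part
    target_stack.extend(source_stack[common:])   # copy only the difference
    return len(source_stack) - common


busy_worker_stack = [1, 2, 3, 4, 5]
idle_worker_stack = [1, 2, 9]
cells_copied = incremental_copy(busy_worker_stack, idle_worker_stack)
```

Here only three of the five cells are transferred, because the first two are already present in the idle worker's stack.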

In contrast to copying, the SLG-WAM requires significant changes to the
WAM in order to support freezing of goals. These changes introduce overheads,
namely in trailing and in stack manipulation. Demoen and Sagonas addressed these
problems by suggesting CAT [5] and, more recently, CHAT [6]. These two models
reduce overheads by copying parts of stacks, instead of freezing. Although there
is an attractive analogy between copying and CAT or CHAT, a more detailed
analysis shows significant drawbacks. First, both assume separate choice-point
and local stacks. Second, both rely on an incremental saving technique to reduce
copying overheads. Unfortunately, the technique assumes that completion always
takes place at generator nodes. As we shall see, these assumptions do not hold
true for parallel tabling. Last, both may incur substantial slowdowns for some
applications. We therefore used the SLG-WAM in our work.

Rather different approaches to tabling have also been proposed recently [18,9]. 
In both cases, the main idea is to recompute tabled goals, instead of suspending. 
Unfortunately, the process of retrying alternatives may cause redundant recom- 
putations of non-tabled subgoals that appear in the body of a looping alternative 
and redundant consumption of answers if the looping alternative contains more 
than one variant subgoal call. Parallel recomputation is harder because we do 
not know beforehand if a tabled alternative needs to be recomputed: a conser- 
vative approach may lose parallelism, and an optimistic approach may lead to 
even more redundant computation. 



4 The Sequential Tabling Engine 

Next, we review the main principles of the YapTab design (please refer to [14,11] 
for more details). YapTab implements two tabling scheduling strategies, batched 
and local [8], and in our initial design it only considers positive programs. Tables 
are implemented using tries as proposed in [10]. We reconsidered decisions in 
the original SLG-WAM that can be a potential source of parallel overheads. 
Namely, YapTab considers that control of leader detection and scheduling of 
unconsumed answers should be performed through the consumer nodes. Hence, 
YapTab associates a new data structure, the dependency frame, to consumer
nodes. In contrast, the SLG-WAM associates this control with generator nodes.
We argue that managing dependencies at the level of the consumer nodes is a 
more intuitive approach that we can take advantage of. 

The introduction of this new data structure allows us to reduce the number 
of extra fields in tabled choice points and to eliminate the need for a separate 
completion stack. Furthermore, allocating the data-structure in a separate area 
simplifies the implementation of parallelism. 

To benefit from the philosophy behind the dependency frame data structure,
we redesigned the algorithms related to suspension, resumption and
completion. We next present YapTab’s main data structures and algorithms. We assume
a batched scheduling strategy implementation [8] (please refer to [11] for the
implementation of local scheduling).

Generator and Consumer Nodes. The YapTab implementation stores generator nodes
as standard nodes plus a pointer to the corresponding subgoal frame. In contrast 
to the SLG-WAM, we adjust the freeze registers by using the top of stack values 
kept in the consumer choice points. YapTab also implements consumer nodes as 
standard nodes plus a pointer to a dependency frame. The dependency frames 
are linked together to form the dependency list of consumer nodes. Additionally, 
dependency frames store information to efficiently check for completion points, 
replacing the need for a separate completion stack [15], as we discuss next. 

Completion and Leader Nodes. The completion operation takes place when a 
generator node exhausts all alternatives and finds itself as a leader node. We 
designed novel algorithms to quickly determine whether a generator node is a 
leader node. 

Our key idea is that each dependency frame holds a pointer to the presumed
leader node of its SCC, and that the youngest consumer node always knows
the leader for the current SCC. Hence, our leader node algorithm must always
compute leader node information when first creating a new consumer node, say
C. To do so, we first hypothesize that the current leader node is C’s generator
node, say Q. Next, for all consumer nodes between C and Q, we check whether
they depend on an older generator node. Consider that the oldest dependency
is for Q'. If this is the case, then Q' is the leader node, otherwise our hypothesis
was correct and the leader is indeed Q.
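
The leader computation described in this paragraph can be sketched as follows. This is a minimal model, not YapTab's data layout: nodes are identified by a creation counter (a larger value means a younger node), and the DependencyFrame fields and the function name are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class DependencyFrame:
    consumer: int   # creation time of the consumer node (larger = younger)
    leader: int     # creation time of its presumed leader node


def compute_leader(frames, generator):
    """Leader for a new consumer whose generator was created at `generator`.

    `frames` holds the dependency frames of the consumer nodes that already
    exist when the new consumer is created.
    """
    leader = generator                          # hypothesis: the generator leads
    for frame in frames:
        if frame.consumer > generator:          # consumer between C and its generator
            leader = min(leader, frame.leader)  # keep the oldest dependency
    return leader


existing_frames = [
    DependencyFrame(consumer=3, leader=1),
    DependencyFrame(consumer=5, leader=2),
]
```

For a generator created at time 4, only the consumer created at time 5 lies between it and the new consumer, so the leader becomes node 2. For a generator created at time 6 no consumer is younger, and the hypothesis stands.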

Whenever we backtrack to a generator that is also the current leader node,
we must check whether there are younger consumer nodes with unconsumed
answers. This is implemented by going through the chain of dependency frames
looking for a frame with unconsumed answers. If there is such a frame, we resume
the computation at the corresponding consumer node. Otherwise, we perform
completion. Completion includes (i) marking all the subgoals in the SCC as
completed; (ii) deallocating all younger dependency frames; (iii) adjusting the
freeze registers; and (iv) backtracking to the previous node to continue the
execution.

Answer Resolution. Answer resolution has to be performed whenever the
computation fails and is resumed at a consumer choice point. The implementation
must guarantee that every answer is consumed exactly once. First, we
check the table space for unconsumed answers for the subgoal at hand. If there 
are new answers, we load the next available answer and proceed with execution. 
Otherwise, we schedule for a backtracking node. If this is the first time that back- 
tracking from that consumer node takes place, then it is performed as usual to 
the previous node. Otherwise, we know that the computation has been resumed 
from an older generator node Q during an unsuccessful completion operation. 
Therefore, backtracking must be done to the next consumer node that has un- 
consumed answers and that is younger than Q. If there are no such consumer 
nodes then backtracking must be done to the generator node Q. 

5 The Or-Parallel Tabling Engine 

The OPTYap engine is based on the YapTab engine. However, new data struc- 
tures and algorithms were required to support parallel execution. Next, we de- 
scribe the main design and implementation decisions. 

Memory Management. The efficiency of a parallel system largely depends on 
how concurrent handling of shared data is achieved and synchronized. Page 
faults and memory cache misses are a major source of overhead regarding data 
access or update in parallel systems. OPTYap tries to avoid these overheads 
by adopting a page-based organization scheme to split memory among different 
data structures, in a way similar to Bonwick’s Slab memory allocator [2]. 

Our experience showed that the table space is a key data area open to con- 
current access operations in a parallel tabling environment. To maximize paral- 
lelism, whilst minimizing overheads, accessing and updating the table space must 
be carefully controlled. Read/write locks are the ideal implementation scheme 
for this purpose. OPTYap implements four alternative locking schemes to deal 
with concurrent accesses to the table data structures. Our results suggested that 
concurrent table access is best handled by schemes that lock table data only 
when writing to the table is likely. 

Leader Nodes. Or-parallel systems execute alternatives early. As a result, it 
is possible that generators will execute earlier, and in a different branch than 
in sequential execution. In the worst case, different workers may execute the 
generator and the consumer goals. Workers may have consumer nodes while 
not having the corresponding generators in their branches. Or, the owner of a 
generator node may have consumers being executed by several different workers. 
This may induce complex dependencies between workers, hence requiring a more 
elaborate completion operation that may involve branches created by several 
workers. 

OPTYap allows completion to take place at any node, not only at generator 
nodes. In order to allow a very flexible completion algorithm we introduce a new
concept, the generator dependency node (or GDN). Its purpose is to signal the
nodes that are candidates to be leader nodes, thus playing a role similar
to that of the generator nodes for sequential tabling. The GDN is calculated
whenever a new consumer node, say C, is allocated. It is defined as the youngest
node D on the current branch of C that is an ancestor of the generator node
Q for C. Figure 1 presents three different situations that better illustrate the
GDN concept. WQ is the worker that allocated the generator node Q, WC is the
worker that is allocating a consumer node C, and the node pointed to by the black
arrow is the GDN for the new consumer.



[Figure: three panels labeled (a), (b) and (c)]

Fig. 1. Spotting the generator dependency node.

In situation (a), the generator node Q is on C’s branch, and thus, Q is the
GDN. In situation (b), nodes N1 and N2 are on C’s branch, and both contain
a branch leading to Q. As N2 is the younger of the two, it is the GDN. In
situation (c), N1 is the unique node that belongs to C’s branch and that also
contains Q in a branch below. N2 contains Q in a branch below, but it is not
on C’s branch, while N3 is on C’s branch, but it does not contain Q in a branch
below. Therefore, N1 is the GDN. Notice that in both cases (b) and (c) the
GDN can be a generator, a consumer or an interior node.
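
The GDN definition can be checked against the three situations with a small sketch. It assumes the search tree is available as a child-to-parent dictionary and that a generator lying on the consumer's own branch counts as its own ancestor, as in situation (a). The function names and the tree encodings are hypothetical.

```python
def ancestors(node, parent):
    """Nodes from `node` up to the root, youngest first (node included)."""
    chain = []
    while node is not None:
        chain.append(node)
        node = parent.get(node)
    return chain


def generator_dependency_node(consumer, generator, parent):
    """Youngest node on the consumer's branch that leads to the generator."""
    leads_to_generator = set(ancestors(generator, parent))
    for node in ancestors(consumer, parent)[1:]:   # walk up C's branch
        if node in leads_to_generator:
            return node
    return None


# Child -> parent maps for situations (a), (b) and (c).
tree_a = {"Q": "root", "C": "Q"}
tree_b = {"N2": "N1", "Q": "N2", "C": "N2"}
tree_c = {"N2": "N1", "Q": "N2", "N3": "N1", "C": "N3"}
```

The upward walk stops at the first node that also appears on the path from the generator to the root, which is why N3 is skipped in situation (c).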

The procedure to compute the leader node information when allocating a 
dependency frame for a new consumer node now hypothesizes that the leader 
node for the consumer node at hand is its GDN, and not its generator node. 

The Control Flow. OPTYap’s execution control mainly flows through four
procedures. The process of completely evaluating SCCs is accomplished by the
completion() and answer_resolution() procedures, while parallel
synchronization is achieved by the getwork() and scheduler() procedures. Here we
focus on the flow of control in engine mode, that is on the completion(),
answer_resolution() and getwork() procedures, and discuss scheduling later.
Figure 2 presents a general overview of how control flows between the three 
procedures and how it flows within each procedure. 





Fig. 2. The flow of control in a parallel tabled evaluation. 



Public Completion. Different paths may be followed when a worker W reaches a 
leader node for a SCC S. The simplest case is when the node is private. In this 
case, we proceed as for sequential tabling. Otherwise, the node is public, and 
there may exist dependencies on branches explored by other workers. Therefore, 
even when all younger consumer nodes on W’s stacks do not have unconsumed 
answers, completion cannot be performed. The reason for this is that the other 
workers can still influence S. For instance, these workers may find new answers 
for a consumer node in S, in which case the consumer must be resumed to
consume the new answers. As a result, in order to allow W to continue execution
it becomes necessary to suspend the SCC at hand.

Suspending in this context is obviously different from suspending consumer 
nodes. Consumer nodes are suspended due to tabled evaluation. SCCs are sus- 
pended due to or-parallel execution. Suspending a SCC includes saving the SCC’s 
stacks to a proper space, leaving in the leader node a reference to where the 
stacks were saved, and readjusting the freeze registers and the stack and frame 
pointers. If the worker did not suspend the SCC, hence not saving the stacks,
any future work-sharing operation might damage the SCC’s stacks and therefore
make delayed completion unworkable.

To deal with the particularities arising with concurrent evaluation, a novel
completion procedure, public_completion(), implements completion detection
for public leader nodes. As for private nodes, whenever a public node finds that 
it is a leader, it starts to check for younger consumer nodes with unconsumed 
answers. If there is such a node, we resume the computation to it. Otherwise, it 
checks for suspended SCCs in the scope of its SCC. A suspended SCC should be 
resumed if it contains consumer nodes with unconsumed answers. To resume a 
suspended SCC a worker needs to copy the saved stacks to the correct position 
in its own stacks, and thus, it has to suspend its current SCC first. 

We thus adopted the strategy of resuming suspended SCCs only when the
worker finds itself at a leader node, since this is a decision point where the worker
either completes or suspends the current SCC. Hence, if the worker resumes a
suspended SCC it does not introduce further dependencies. This would not be the
case if the worker resumed a suspended SCC R as soon as it reached the node
where it had suspended. In that situation, the worker would have to suspend its
current SCC S, and after resuming R it would probably have to also resume S
to continue its execution. A first disadvantage is that the worker would have to
make more suspensions and resumptions. Moreover, if we resume earlier, R may
include consumer nodes with unconsumed answers that are common with S.
More importantly, suspending in non-leader nodes leads to further complexity.
Answers can be found in upper branches for suspensions made in lower nodes,
and this can be very difficult to manage.

A SCC S is completely evaluated when (i) there are no unconsumed answers
in any consumer node in its scope, that is, in any consumer node belonging to
S or in any consumer node within a SCC suspended in a node belonging to S;
and (ii) there is only a single worker owning its leader node L. We say that a
worker owns a node N when it holds N on its stacks (this is true even if N
is not on the worker’s current branch). Completing a SCC includes (i) marking
all dependent subgoals as complete; (ii) releasing the frames belonging to the 
complete branches, including the branches in suspended SCCs; (iii) releasing 
the frozen stacks and the memory space used to hold the stacks from suspended 
SCCs; and (iv) readjusting the freeze registers and the whole set of stack and 
frame pointers. 
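
The two conditions for a completely evaluated SCC translate into a short recursive check. The sketch below is only a model of the test, not OPTYap's implementation: the Consumer and SCC records, their fields and the function names are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Consumer:
    consumed: int    # answers this consumer node has already consumed
    available: int   # answers currently stored for its subgoal


@dataclass
class SCC:
    consumers: list = field(default_factory=list)
    suspended: list = field(default_factory=list)  # SCCs suspended inside this one
    leader_owners: int = 1                         # workers holding the leader node


def has_unconsumed(scc):
    """Condition (i): look in the SCC itself and in the SCCs suspended in it."""
    if any(c.consumed < c.available for c in scc.consumers):
        return True
    return any(has_unconsumed(inner) for inner in scc.suspended)


def completely_evaluated(scc):
    """Conditions (i) and (ii) together."""
    return not has_unconsumed(scc) and scc.leader_owners == 1


inner_scc = SCC(consumers=[Consumer(consumed=1, available=2)])
outer_scc = SCC(consumers=[Consumer(consumed=3, available=3)],
                suspended=[inner_scc])
```

In this example the outer SCC is not yet complete, because a consumer inside the suspended SCC still has one answer left to consume.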

Our public completion algorithm has two major advantages. One is that the 
worker checking for completion determines if its current SCC is completely
evaluated or not without requiring any explicit communication or synchronization
with other workers. The other is that it uses the SCC as the unit for suspension. 
This latter advantage is very important since it simplifies the management of 
dependencies arising from branches not on stack. A leader node determines the 
position from where dependencies may exist in younger branches. As a suspen- 
sion unit includes the whole SCC and suspension only occurs in leader node 
positions, we can simply use the leader node to represent the whole scope of a 
suspended SCC, and therefore simplify its management. 



Answer Resolution. The answer resolution operation for the parallel environment 
essentially uses the same algorithm as previously described for private nodes. 

Getwork. This is the last flow control procedure. It contributes to the progress of a
parallel tabled evaluation by moving to effective work. The usual way to execute
getwork() is through failure to the youngest public node on the current branch.
We can distinguish two blocks of code in the getwork() procedure. The first
block detects completion points and therefore makes the computation flow to the 
public_completion() procedure. The second block corresponds to or-parallel 
execution. It synchronizes to check for available alternatives and executes the 
next one, if any. Otherwise, it invokes the scheduler. 

The getwork() procedure detects a completion point when N is the leader
node pointed to by the top dependency frame. The exception is if N is itself a
generator node for a consumer node within the current SCC and it contains
unexploited alternatives. In such cases, the current SCC is not fully exploited.
Hence, we should first exploit the available alternatives, and only then invoke
completion.
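The two blocks of getwork() and the exception for generator nodes can be summarized as a small decision function. This is only an illustrative sketch: the `Node` class, its fields and the returned labels are hypothetical names introduced here, not OPTYap data structures.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical stand-in for a public choice point."""
    alternatives: List[str] = field(default_factory=list)
    # True when the node is a generator for a consumer inside the current SCC.
    generator_in_scc: bool = False


def getwork(node: Node, leader: Node) -> Tuple[str, Optional[str]]:
    """Decide what a worker does after failing to the public node `node`.

    `leader` plays the role of the leader node referenced by the top
    dependency frame.
    """
    at_leader = node is leader
    if at_leader and not (node.generator_in_scc and node.alternatives):
        # First block: the SCC is fully exploited, so attempt completion.
        return ("public_completion", None)
    if node.alternatives:
        # Second block: or-parallel execution of the next alternative.
        return ("alternative", node.alternatives.pop(0))
    # No local work left: hand control to the scheduler.
    return ("scheduler", None)
```

A leader node that is also a generator with pending alternatives yields those alternatives first and only then reaches the completion branch.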

Scheduling Work. Scheduling work is the scheduler’s task. It is about efficiently 
distributing the available work for exploitation between the running workers. 
In a parallel tabling environment we have the extra constraint of keeping the 
correctness of sequential tabling semantics. A worker enters in scheduling mode 
when it runs out of work and returns to execution whenever a new piece of 
unexploited work is assigned to it by the scheduler. 

The scheduler for the OPTYap engine is mainly based on YapOr’s scheduler. 
All the scheduler strategies implemented for YapOr were used in OPTYap. How- 
ever, extensions were introduced in order to preserve the correctness of tabling 
semantics. These extensions allow support for leader nodes, frozen stack seg- 
ments, and suspended SCCs. The OPTYap model was designed to enclose the 
computation within a SCC until the SCC was suspended or completely evalu- 
ated. Thus, OPTYap introduces the constraint that the computation cannot flow 
outside the current SCC, and workers cannot be scheduled to execute at nodes
older than their current leader node. Therefore, when scheduling for the nearest 
node with unexploited alternatives, if it is found that the current leader node is 
younger than the potential nearest node with unexploited alternatives, then the 
current leader node is the node scheduled to proceed with the evaluation.

Moving In the Tree. The next case is when the process above does not return
any node to proceed execution. The scheduler then starts searching for busy 
workers that can be requested for work. If such a worker B is found, then the 
requesting worker moves up to the lowest node that is common to B, in order 
to become partially consistent with part of B. Otherwise, no busy worker was 
found, and the scheduler moves the idle worker to a better position in the search 
tree. Therefore, we can enumerate three different situations for a worker to move
up to a node N: (i) N is the nearest node with unexploited alternatives; (ii) N is
the lowest node common with the busy worker we found; or (iii) N corresponds
to a better position in the search tree.

The process of moving up in the search tree from a current node N_0 to
a target node N_f is implemented by the move_up_one_node() procedure. This
procedure is invoked for each node that has to be traversed until reaching N_f.
The presence of frozen stack segments or the presence of suspended SCCs in 
the nodes being traversed influences and can even abort the usual moving up 
process. 

Assume that the idle worker W is currently positioned at N_i and that it wants
to move up one node. Initially, the procedure checks for frozen nodes on the stack
to infer whether W is moving within a SCC. If so, W is simply removed as a
member of N_i. The interesting case is when W is not within a SCC. If N_i holds
a suspended SCC, then W can safely resume it. If resumption does not take
place, the procedure proceeds to check whether N_i is a consumer node. If this
is the case, W is removed as a member of N_i and, if W is the unique owner
of N_i, the suspended SCCs in N_i can be completed. Completion can be
safely performed over the suspended SCCs in N_i not only because the SCCs are
completely evaluated, as none was previously resumed, but also because no more
dependencies exist, as there are no more branches below N_i. The reasons given
to complete the suspended SCCs in N_i hold even if N_i is not a consumer node,
as long as W is the unique owner of N_i. In such a case, if N_i is a generator node
then its corresponding subgoal can also be marked as completed. Otherwise, W
is simply removed as a member and owner of N_i.

6 Initial Performance Evaluation 

The environment for our experiments consists of a shared memory parallel ma- 
chine, a 200 MHz PentiumPro with 4 processors, 128 MBytes of main memory, 
256 KBytes of cache and running the linux-2.2.12 kernel. The machine was oth- 
erwise idle while benchmarking. 

YapOr, YapTab and OPTYap are based on Yap’s 4.2.1 engine. Note that 
sequential execution would be somewhat better with more recent Yap engines. 
We used the same compilation flags for Yap, YapOr, YapTab and OPTYap. 
Regarding XSB Prolog, we used version 2.3 with the default configuration and 
the default execution parameters (chat engine and batched scheduling). 

Non-Tabled Benchmarks. To put the performance results in perspective we first
use a common set of non-tabled benchmark programs to evaluate how the original
Yap Prolog engine compares against the several Yap extensions and against
the most well-known tabling engine, XSB Prolog. The benchmarks include the
n-queens problem, the puzzle and cubes problems from Evan Tick’s book, a
Hamiltonian graph problem and a naive sort algorithm. All benchmarks find all
solutions for the problem.

Table 1 shows the base running times, in milliseconds, for Yap, YapOr,
YapTab, OPTYap and XSB for the set of non-tabled benchmarks. In parentheses,
it shows the overhead over the Yap running times. The results indicate
that YapOr, YapTab and OPTYap introduce, on average, an overhead of about
6%, 8% and 12% respectively over standard Yap. Regarding XSB, the results
show that, on average, XSB is 1.9 times slower than Yap, a result mainly due to
the faster Yap engine.



Table 1. Running times (in milliseconds) on non-tabled programs; overhead over Yap in parentheses.

Program     Yap     YapOr          YapTab         OPTYap         XSB
9-queens    584     604 (1.03)     605 (1.04)     626 (1.07)     1100 (1.88)
cubes       170     170 (1.00)     173 (1.02)     175 (1.03)     329 (1.94)
ham         371     402 (1.08)     399 (1.08)     432 (1.16)     659 (1.78)
nsort       310     330 (1.06)     328 (1.06)     354 (1.14)     629 (2.03)
puzzle      1633    1818 (1.11)    1934 (1.18)    1950 (1.19)    3059 (1.87)
Average             (1.06)         (1.08)         (1.12)         (1.90)


YapOr overheads result from handling the work load register and from testing 
operations that (i) verify whether a node is shared or private, (ii) check for 
sharing requests, and (iii) check for backtracking messages due to cut operations. 
On the other hand, YapTab overheads are due to the handling of the freeze 
registers and support of the forward trail. OPTYap overheads result from both. 

Since OPTYap is based on the same environment model as the one used by 
YapOr, we then compare OPTYap’s parallel performance with that of YapOr. 
Table 2 shows the speedups relative to the single worker case for YapOr and 
OPTYap with 2, 3 and 4 workers. Each speedup corresponds to the best execu- 
tion time obtained in a set of 3 runs. The results show that OPTYap maintains 
YapOr’s behavior in exploiting or-parallelism in non-tabled programs, even though
it includes all the machinery required to support tabled programs.

Tabled Benchmarks. We then use a set of tabled benchmark programs to measure 
the performance of the tabling engines in discussion. The benchmarks include two 
transition systems from XMC specs^, a same generation problem for a 24x24x2 
data cylinder, and two path problems that find the transitive closure of different 
graph topologies. All benchmarks find all the solutions for the problem. 

^ We are thankful to C.R. Ramakrishnan for providing us these benchmarks.

Table 2. Speedups for YapOr and OPTYap on non-tabled programs.

            YapOr                   OPTYap
Program     2       3       4       2       3       4
9-queens    1.99    2.99    3.94    2.00    2.99    3.96
cubes       2.00    2.98    3.95    1.98    2.96    3.97
ham         2.00    2.95    3.90    1.97    2.93    3.78
nsort       1.97    2.92    3.83    1.97    2.92    3.80
puzzle      2.02    3.03    4.02    1.98    2.97    3.94
Average     2.00    2.97    3.93    1.98    2.95    3.89


Table 3 shows the base running times, in milliseconds, for YapTab, OPTYap
and XSB for the set of tabled benchmarks. In parentheses, it shows the overhead
over the YapTab running times. The results indicate that OPTYap introduces,
on average, an overhead of about 17% over YapTab for tabled programs, which
is much worse than the overhead of 5% for non-tabled programs. The difference
results from locking requests to handle the data structures introduced by tabling.
Locks are required to insert new trie nodes into the table space, and to update
subgoal and dependency frame pointers to tabled answers. We observed that the
benchmarks that deal with more tabled answers per time unit are the ones that
perform more locking operations and in consequence introduce further overheads.



Table 3. Running times (in milliseconds) on tabled programs; overhead over YapTab in parentheses.

Program       YapTab    OPTYap          XSB
xmc-sieve     2851      3226 (1.13)     3560 (1.25)
xmc-iproto    2438      2736 (1.22)     4481 (1.84)
same-gen      16598     17034 (1.03)    25390 (1.82)
path-grid     1069      1240 (1.16)     3610 (3.38)
path-chain    102       136 (1.33)      271 (2.66)
Average                 (1.17)          (2.19)



Regarding XSB, the results show that, on average, YapTab is slightly more 
than twice as fast as XSB, surprisingly a better result than for non-tabled bench- 
marks. In particular, XSB shows the worst behavior for the two programs that 
are more table intensive. We believe that the XSB performance may be caused 
by overheads in their tabling implementation. XSB must support negated lit- 
erals, and also has recently been extended to support attributed variables and 
especially subsumption. 

Parallel Tabled Benchmarks. To assess the performance of OPTYap when running
the tabled programs in parallel, we ran OPTYap for the same set of tabled
programs with a varying number of workers. Table 4 shows the speedups relative
to the single worker case for OPTYap with 2, 3 and 4 workers. Each speedup
corresponds to the best execution time obtained in a set of 3 runs. The table
is divided into two blocks: the upper block groups the benchmarks that showed
potential for parallel execution, whilst the lower block includes the benchmark
that does not show any gains when run in parallel.

Table 4. Speedups for OPTYap on tabled programs.

              Number of Workers
Program       2       3       4
xmc-sieve     2.00    3.00    3.99
xmc-iproto    1.90    2.78    3.64
same-gen      2.04    2.84    3.86
path-grid     1.82    2.54    3.10
Average       1.94    2.79    3.65

path-chain    0.92    0.86    0.78

Globally, our results show quite good speedups for the upper block programs,
especially considering that the execution times were obtained in a multiprocess
environment. In particular, xmc-sieve achieves linear speedups up to 4 workers.
The same-gen benchmark also presents excellent results up to 4 workers, and
xmc-iproto and path-grid show a slight slowdown with the increase in the
number of workers. On the other hand, the path-chain benchmark does not
show any speedup at all.
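One way to read these figures is through parallel efficiency, i.e. speedup divided by the number of workers. Efficiency is not a metric used in the text; it is introduced here only as a convenient illustration, and the values are simply derived from the reported speedups.

```python
# Speedups copied from Table 4 (keys are the number of workers).
SPEEDUPS = {
    "xmc-sieve": {2: 2.00, 3: 3.00, 4: 3.99},
    "xmc-iproto": {2: 1.90, 3: 2.78, 4: 3.64},
    "same-gen": {2: 2.04, 3: 2.84, 4: 3.86},
    "path-grid": {2: 1.82, 3: 2.54, 4: 3.10},
    "path-chain": {2: 0.92, 3: 0.86, 4: 0.78},
}
UPPER_BLOCK = ["xmc-sieve", "xmc-iproto", "same-gen", "path-grid"]


def efficiency(program, workers):
    """Speedup per worker; 1.0 corresponds to a linear speedup."""
    return SPEEDUPS[program][workers] / workers


def block_average(programs, workers):
    """Average speedup over a group of benchmarks."""
    return sum(SPEEDUPS[p][workers] for p in programs) / len(programs)


if __name__ == "__main__":
    for program in SPEEDUPS:
        print(program, [round(efficiency(program, w), 2) for w in (2, 3, 4)])
```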

Through experimentation, we observed that workers are busy for more than
95% of the execution time, even for 4 workers. In general, slowdowns are not
caused by workers becoming idle and starting to search for work, as usually
happens with parallel execution of non-tabled programs. Here the problem seems
more complex: workers do have available work, but there is a lot of contention
to access that work.

Closer analysis suggested that there are two main reasons that constrain
speedups. One relates to massive table access to insert and consume answers.
As tries are a compact data structure, the presence of massive table
access increases the number of contention points. The other relates to the
order in which answers are found. There are answers that can only
be found when other answers are also found, and the process of finding such
answers cannot be anticipated. This incurs high overheads related to SCC
suspensions and resumptions.



7 Conclusions 

In this paper we have presented the design and implementation of OPTYap. 
To the best of our knowledge, OPTYap is the first parallel tabling engine for
logic programming systems. OPTYap extends the Yap Prolog system both with
the SLG-WAM, initially implemented for XSB Prolog, and with environment 
copying, initially implemented in the Muse or-parallel system. 

First results show that OPTYap introduces low overheads for sequential exe- 
cution, and that it compares favorably with current versions of XSB. Moreover, 
the results showed that OPTYap maintains YapOr’s effective speedups in ex- 
ploiting or-parallelism in non-tabled programs. For parallel execution of tabled 
programs, OPTYap showed linear speedups for a well known application of XSB, 
and quite good results globally. These results emphasize our belief that tabling 
and parallelism are a very good match. 

On the other hand, there are tabled programs where OPTYap may not 
speed up execution. Parallel execution of tabled programs may have different
characteristics than traditional or-parallel programs. In general, tabling tends to 
decrease the height of the search tree, whilst increasing its breadth. We there- 
fore believe that improvements in scheduling and on concurrent access to tries 
may be fundamental for scalable performance. We plan to investigate this issue 
further, also by studying more programs.


Abstract. This paper revisits the classical cardinality operator, introducing new
propagation rules that operate on variables that occur in more than one
constraint. It also introduces a restricted case of the cardinality operator which is
characterized by a structure of sliding constraints on consecutive variables. We
call it cardinality-path and take advantage of these restrictions in order to come
up with more efficient propagation algorithms. From an application point of
view, the cardinality-path constraint makes it possible to express a host of regulation
constraints occurring in personnel planning problems. We have used the
meta-programming services of Prolog in order to implement the cardinality-path
constraint within SICStus Prolog.



1 Introduction 

Since its introduction 10 years ago, the cardinality operator [3] has been recognized
as a generic concept which was progressively integrated in most of the modern
constraint systems. It has the form
$\mathrm{cardinality}(C, \{CTR_1(V_{11},..,V_{1k_1}),..,CTR_n(V_{n1},..,V_{nk_n})\})$
where $C$ is a domain variable* and $\{CTR_1(V_{11},..,V_{1k_1}),..,CTR_n(V_{n1},..,V_{nk_n})\}$
is a set of constraints. The cardinality operator holds iff:

\[ \sum_{i=1}^{n} \#CTR_i(V_{i1},..,V_{ik_i}) = C \qquad (1) \]

where $\#CTR_i(V_{i1},..,V_{ik_i})$ is equal to 1 if constraint $CTR_i(V_{i1},..,V_{ik_i})$ holds and 0
otherwise. From an operational point of view the cardinality operator uses entailment [4]
in order to implement the corresponding propagation. However, a fundamental weakness
of the previous propagation scheme is that it assumes each constraint to be independent.
In practice this assumption is too optimistic since, very often, the same
variables occur in all the constraints of the cardinality operator. The first contribution
of this paper is to provide new pruning rules for the cardinality operator that take
advantage of the fact that some variables occur in more than one constraint.
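Condition (1) can be checked directly once every variable is fixed. The following minimal sketch does only that (it performs no propagation); the representation of a constraint as a `(scope, predicate)` pair and the sample constraints are assumptions made for illustration.

```python
def count_holding(constraints, assignment):
    """Number of constraints satisfied by a complete assignment.

    Each constraint is a pair (scope, predicate): `scope` is a tuple of
    variable names and `predicate` is evaluated on their values.
    """
    return sum(
        1
        for scope, predicate in constraints
        if predicate(*(assignment[name] for name in scope))
    )


def cardinality_holds(c, constraints, assignment):
    """Condition (1): exactly `c` of the constraints hold."""
    return count_holding(constraints, assignment) == c


# Hypothetical example: three constraints sharing the variables X, Y and Z.
CONSTRAINTS = [
    (("X", "Y"), lambda x, y: x < y),
    (("Y", "Z"), lambda y, z: y < z),
    (("X", "Z"), lambda x, z: x + z == 3),
]
ASSIGNMENT = {"X": 1, "Y": 2, "Z": 2}
```

With this assignment the first and third constraints hold, so the operator is satisfied for C = 2 only.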



* A domain variable is a variable that ranges over a finite set of integers; dom($V$), min($V$) and
max($V$) respectively denote the set of possible values of variable $V$, the minimum value of $V$
and the maximum value of $V$.

P. Codognet (Ed.): ICLP 2001, LNCS 2237, pp. 59-73, 2001.

The second part of this paper introduces a restricted case of the cardinality operator
which is characterized by a structure of sliding constraints on consecutive variables.
We call it cardinality-path and present generic propagation algorithms for this
family of constraints. It regroups a set of global constraints that were described in [1].
The cardinality-path family constraint has the form $\mathrm{cardinality\_path}(C, \{V_1,..,V_n\}, CTR)$
where $C$ is a domain variable, $\{V_1,..,V_n\}$ is a collection of domain variables and $CTR$
is a $k$-ary elementary constraint ($2 \le k \le n$). The constraint holds iff:

\[ \sum_{i=1}^{n-k+1} \#CTR(V_i,..,V_{i+k-1}) = C \qquad (2) \]

where $\#CTR(V_i,..,V_{i+k-1})$ is equal to 1 if constraint $CTR(V_i,..,V_{i+k-1})$ holds and 0
otherwise. Condition (2) expresses the fact that the cardinality-path constraint holds if
exactly $C$ constraints out of the set $\{CTR(V_1,..,V_k), CTR(V_2,..,V_{k+1}),..,CTR(V_{n-k+1},..,V_n)\}$
are satisfied.
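For fully instantiated variables, Condition (2) amounts to sliding a window of width k over the sequence and counting the windows on which CTR holds. The sketch below is a ground checker only, with an illustrative "neighbours differ" constraint chosen here as the elementary constraint.

```python
def sliding_windows(values, k):
    """All n-k+1 windows of k consecutive values."""
    return [tuple(values[i:i + k]) for i in range(len(values) - k + 1)]


def cardinality_path_holds(c, values, k, ctr):
    """Condition (2): CTR holds on exactly `c` of the sliding windows."""
    return sum(1 for window in sliding_windows(values, k) if ctr(*window)) == c


def differ(a, b):
    """Illustrative binary elementary constraint: consecutive values differ."""
    return a != b
```

For the sequence 4, 4, 3, 4, 1 and k = 2 there are four windows, three of which satisfy `differ`, so the constraint holds for C = 3.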

The first and second generic propagation algorithms that we present for the
cardinality-path constraint take partial advantage of this specific structure in order to
derive stronger pruning than the one that can be achieved by those rules described in
[3, pages 749-751]. The third propagation algorithm combines the second algorithm 
with a special case of the new pruning rule introduced for the cardinality operator in 
order to derive stronger propagation. 

The cardinality-path constraint family is also useful for those over-constrained 
problems having the structure described in the previous paragraph. In this case it 
allows one to get an upper bound of the maximum number of constraints that hold and to
propagate in order to try to achieve this upper bound.

In order to make all our propagation algorithms generic, constraint CTR is defined 
by the following functions: 

- enforce_CTR($V_i,..,V_{i+k-1}$): adds constraint CTR($V_i,..,V_{i+k-1}$) to the constraint store,

- enforce_NOT_CTR($V_i,..,V_{i+k-1}$): adds the negation² of constraint CTR($V_i,..,V_{i+k-1}$)
to the constraint store.

The previous functions trigger constraint propagation that will be carried on until 
saturation. Failure detection should be independent from the order in which constraints
CTR are posted. In addition we also use the following primitives:

- create_choice_point: creates a choice point in order to be able to return to the cur- 
rent state later on, 

- backtrack: restores the state of the domain variables and of the constraint store as it 
was on the last call to create_choice_point. 

The next section presents new propagation rules for the cardinality operator.
Sect. 3 describes the different aspects of cardinality-path: Sect. 3.1 first provides
some instances of the cardinality_path constraint family; Sect. 3.2 and 3.3 show how
to compute a lower and an upper bound of the number of elementary constraints that
hold; Sect. 3.4 indicates how to prune variables $V_1,..,V_n$ according to the minimum
and maximum value of $C$; finally, Sect. 3.5 shows how to partially integrate external
constraints within the previous propagation algorithms. The last section provides for
the cardinality-path constraint family a new pruning rule which combines the new
pruning rule introduced for the cardinality operator in Sect. 2 with the algorithm
described in Sect. 3.4.

² The negation of constraint CTR is denoted ¬CTR; ¬CTR($V_i,..,V_{i+k-1}$) holds iff
CTR($V_i,..,V_{i+k-1}$) does not hold.



2 Revisiting the Cardinality Operator 

Consider the cardinality operator
$\mathrm{cardinality}(C, \{CTR_1(V_{11},..,V_{1k_1}),..,CTR_n(V_{n1},..,V_{nk_n})\})$
and let $V_1,..,V_p$ be a set of variables which all occur in all the different constraints
$CTR_1,..,CTR_n$. Let dom($V_j|CTR_i$) denote the domain of variable $V_j$ ($1 \le j \le p$) under the
assumption that $CTR_i$ ($1 \le i \le n$) is enforced³. The new pruning rule is based on the
following observation: a value $val \in$ dom($V_j$) can remain in dom($V_j$) only if
$val \in$ dom($V_j|CTR_i$) for at least min($C$) different values of $i$. In fact, this rule is a
generalization of constructive disjunction [4], [5]. This leads to the following algorithm.

1  nfail:=0;
2  FOR i:=1 TO n DO
3    create_choice_point;
4    IF enforce_CTR(Vi1,..,Viki) fails THEN nfail:=nfail+1;
5    ELSE
6      FOR j:=1 TO p DO
7        FOR all val∈dom(Vj) such that count_val[j,val]⁴<min(C) DO
8          count_val[j,val]:=count_val[j,val]+1;
9    backtrack;
10 IF nfail>n-min(C) THEN RETURN fail;
11 adjust max(C) to n-nfail;
12 FOR j:=1 TO p DO
13   FOR all val∈dom(Vj) such that count_val[j,val]<min(C) DO
14     remove value val from Vj;
15 RETURN delay;

The first part of the algorithm initializes the count_val matrix (lines 1-10) while
the last part (lines 11-15) exploits the count_val matrix in order to prune. A similar
procedure performs pruning according to the negation of the different constraints
$CTR_i$ ($1 \le i \le n$). If one is only interested in adjusting the minimum and maximum
value of variables $V_j$ ($1 \le j \le p$) then we can instead use the following simplified
version of the previous algorithm.



³ dom($V_j|CTR_i$) is empty if $CTR_i$ leads to a contradiction.
⁴ We assume count_val[j,val] to be initialized to 0.




62 N. Beldiceanu and M. Carlsson 



1  nfail:=0;
2  FOR i:=1 TO n DO
3    create_choice_point;
4    IF enforce_CTR(Vi1,..,Viki) fails THEN nfail:=nfail+1;
5    ELSE
6      FOR j:=1 TO p DO
7        min_val[j,i]:=min(Vj);
8        max_val[j,i]:=max(Vj);
9    backtrack;
10 IF nfail>n-min(C) THEN RETURN fail;
11 adjust max(C) to n-nfail;
12 FOR j:=1 TO p DO
13   adjust min(Vj) to the min(C)-th smallest value of min_val[j,1..n]⁵;
14   adjust max(Vj) to the min(C)-th largest value of max_val[j,1..n]⁶;
15 RETURN delay;

Line 13 removes all values $val \in$ dom($V_j$) which are smaller than the min($C$)-th smallest
value of min(dom($V_j|CTR_i$)) ($1 \le i \le n$), while line 14 discards all values
$val \in$ dom($V_j$) which are greater than the min($C$)-th largest value of max(dom($V_j|CTR_i$))
($1 \le i \le n$).
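The value-counting rule of the first algorithm of this section can be prototyped over explicit finite domains. The sketch below is not the SICStus implementation: `enforce` is a brute-force stand-in for constraint propagation, working on a copy of the domains instead of using choice points, and all names are assumptions introduced for illustration.

```python
from itertools import product


def enforce(domains, scope, predicate):
    """Filter the domains of `scope` to values supported by `predicate`.

    Returns a new domain dictionary, or None if the constraint fails.
    Working on a copy plays the role of create_choice_point/backtrack.
    """
    supports = [
        values
        for values in product(*(sorted(domains[v]) for v in scope))
        if predicate(*values)
    ]
    if not supports:
        return None
    trial = dict(domains)
    for position, name in enumerate(scope):
        trial[name] = {values[position] for values in supports}
    return trial


def cardinality_prune(domains, constraints, shared, min_c):
    """Keep a value of a shared variable only if at least `min_c`
    constraints, enforced one at a time, leave it in the domain.

    Returns (pruned domains, new upper bound for C) or None on failure.
    """
    nfail = 0
    count = {(name, val): 0 for name in shared for val in domains[name]}
    for scope, predicate in constraints:
        trial = enforce(domains, scope, predicate)
        if trial is None:
            nfail += 1
            continue
        for name in shared:
            for val in trial[name]:
                count[(name, val)] += 1
    if nfail > len(constraints) - min_c:
        return None
    pruned = dict(domains)
    for name in shared:
        pruned[name] = {v for v in domains[name] if count[(name, v)] >= min_c}
    return pruned, len(constraints) - nfail


# Hypothetical instance: three constraints over the shared variables X and Y.
DOMAINS = {"X": {1, 2, 3}, "Y": {1, 2, 3}}
CONSTRAINTS = [
    (("X", "Y"), lambda x, y: x < y),
    (("X", "Y"), lambda x, y: x + y == 4),
    (("X", "Y"), lambda x, y: x == 1),
]
```

With min(C) = 2, the value 3 is removed from X because only the constraint x + y = 4 can hold when X = 3. Unlike line 7 of the pseudocode, the sketch keeps counting past min(C); this does not change which values are removed.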



3 The Cardinality-Path Constraint Family 

3.1 Instances of the Cardinality-Path Constraint Family 

The purpose of this paragraph is to provide various concrete examples of the 
cardinality-path constraint family. These examples are given in Table 1 and a 
possible practical use is provided for each of them at the end of this subsection. 

The first column of Table 1 provides the arity $k$ of the elementary constraint
CTR, while the second column describes a member of the family in terms of the
parameters of the cardinality-path family: it gives the initial lower and upper values
for variable $C$ and defines the elementary constraint CTR. Finally the last column of
Table 1 describes the parameters of a family member and provides an example where
the constraint holds. In order to make these examples more readable, two spaces after
the value of a variable $V_i$ ($1 \le i \le n-k+1$) indicate that constraint CTR($V_i,..,V_{i+k-1}$)
does hold. For instance, the example change(3, {4,4,  3,  4,  1}, ≠) of the first row of
Table 1 denotes that the three constraints 4≠3, 3≠4 and 4≠1 hold,
and that constraint 4≠4 does not hold. Note that the meaning of a family member
can be derived from Condition (2) and from the second column of Table 1.



⁵ We assume min_val[j,1..n] to be initialized to max(Vj).
⁶ We assume max_val[j,1..n] to be initialized to min(Vj).

Table 1. Members of the cardinality-path constraint family

Arity  Bound for C and elementary constraint CTR      Member and example of a solution

2      C: 0..n-1                                      change(C, {V_1,..,V_n}, ≠)
       X_i ≠ X_{i+1}                                  change(3, {4,4,  3,  4,  1}, ≠)

2      C: 0..n-1                                      cyclic_change(C, Z, {V_1,..,V_n})
       (X_i + 1) mod Z ≠ X_{i+1}                      cyclic_change(2, 5, {2,3,4,0,  2,3,  1})

2      C: 0..n-1                                      cyclic_change_joker(C, Z, {V_1,..,V_n})
       (X_i + 1) mod Z ≠ X_{i+1} ∧                    cyclic_change_joker(2, 4, {3,0,  2,4,4,4,3,  1,4})
       X_i < Z ∧ X_{i+1} < Z

2      C: 0..n-1                                      smooth(C, T, {V_1,..,V_n})
       |X_i - X_{i+1}| > T                            smooth(1, 2, {1,3,4,5,  2})

3      C: 0..n-2                                      number_of_rest(C, {V_1,..,V_n})
       X_i = 0 ∧ X_{i+1} = 0 ∧ X_{i+2} ≠ 0            number_of_rest(2, {2,0,  0,1,1,0,2,0,  0,1,2})

k      C = n-k+1                                      among_seq(low, up, k, {V_1,..,V_n}, Values)
       low ≤ Σ_{j=i}^{i+k-1} (X_j in Values)⁷ ≤ up    among_seq(1, 2, 4, {9,  2,  4,  5,  5,7,2}, {0,2,4,6,8})

k      C = n-k+1                                      sliding_sum(low, up, k, {V_1,..,V_n})
       low ≤ Σ_{j=i}^{i+k-1} X_j ≤ up                 sliding_sum(3, 7, 4, {1,  4,  2,  0,  0,3,4})

k      C: atleast..atmost                             relaxed_sliding_sum(atleast, atmost, low, up, k, {V_1,..,V_n})
       low ≤ Σ_{j=i}^{i+k-1} X_j ≤ up                 relaxed_sliding_sum(3, 4, 3, 7, 4, {2,4,  2,  0,  0,3,4})
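The example solutions of Table 1 can be cross-checked by listing the positions at which each elementary constraint holds. The sketch below encodes the elementary constraints as Python predicates; the helper names are assumptions introduced here, and the code only verifies ground sequences.

```python
def holding_positions(values, k, ctr):
    """1-based positions i such that CTR(V_i,..,V_{i+k-1}) holds."""
    return [
        i + 1
        for i in range(len(values) - k + 1)
        if ctr(*values[i:i + k])
    ]


def cyclic_change(z):
    return lambda a, b: (a + 1) % z != b


def cyclic_change_joker(z):
    # Values greater than or equal to z are jokers and never count.
    return lambda a, b: (a + 1) % z != b and a < z and b < z


def smooth(t):
    return lambda a, b: abs(a - b) > t


def number_of_rest(a, b, c):
    return a == 0 and b == 0 and c != 0


def among_seq(low, up, allowed):
    return lambda *window: low <= sum(x in allowed for x in window) <= up


def sliding_sum(low, up):
    return lambda *window: low <= sum(window) <= up
```

For instance, `holding_positions([2, 3, 4, 0, 2, 3, 1], 2, cyclic_change(5))` lists positions 4 and 6, matching the two values followed by two spaces in the cyclic_change row.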



Constraints change, cyclic_change, cyclic_change_joker, smooth, among_seq,
sliding_sum and relaxed_sliding_sum were respectively described at pages 43, 44, 45, 46,
40, 41 and 42 of [1]. From a practical point of view, the constraints of the previous
table can be used for the following purposes:

- change can be used for timetabling problems in order to put an upper limit on the
number of changes during a given period,

- cyclic_change may be used for personnel cyclic timetabling problems where each
person has to work according to cycles that sometimes have to be broken,

- cyclic_change_joker may be used in the same context as the cyclic_change
constraint with the additional interpretation that holidays (i.e. those values that are
greater than or equal to Z) are not subject to the cyclic constraint,

- smooth can be used to put a limit on the number of drastic variations on a given
attribute (for example the number of persons working on consecutive weeks),

- number_of_rest allows controlling the number of rest days over a period of work,
where a rest day is a period of at least two consecutive days off and one work day,

- among_seq may be used to express frequency constraints for producing goods for
which one can have different variants (for example the car sequencing problem
[2]),

- sliding_sum allows restricting the total number of working hours on periods of
consecutive days,

- relaxed_sliding_sum has the same utility as the sliding_sum constraint, but in
addition allows expressing the fact that the rule may be broken sometimes.

⁷ X in Values is equal to 1 if X ∈ Values, and 0 otherwise.

More complete examples of utilization of the previous constraints can be found in 
[1]. The next two subsections indicate respectively how to compute a lower and an 
upper bound for the number of elementary constraints that hold. 



3.2 Computing a Lower Bound of the Minimum Number 
of Elementary Constraints That Hold 

The following greedy algorithm returns in min_break a lower bound of the minimum
number of elementary constraints that hold. It tries to impose the negation of constraint
CTR on consecutive variables as long as no failure occurs. A failure corresponds to the fact
that posting a new constraint ¬CTR leads to a contradiction. In order to keep the
propagation implied by enforce_NOT_CTR(Ui,..,Ui+k-1) (line 10) local to the
constraints we state, we duplicate variables $V_1,..,V_n$. However, usual saturation is used
for the constraints we enforce; in particular they can trigger each other until no further
deduction is possible. Variables $U_1,..,U_n$ will be deallocated when the last backtrack
occurs.

1  exist_choice_point:=1;
2  create_choice_point;
3  copy variables V1..Vn to U1..Un;
4  min_break:=0;
5  FOR i:=1 TO n-k+1 DO
6    IF exist_choice_point=0 THEN
7      exist_choice_point:=1;
8      create_choice_point;
9    ENDIF;
10   IF enforce_NOT_CTR(Ui,..,Ui+k-1) fails THEN
11     backtrack;
12     exist_choice_point:=0;
13     min_break:=min_break+1;
14   ENDIF;
15 ENDFOR;
16 IF exist_choice_point THEN backtrack ENDIF;

Let’s call a maximal sequence a sequence of consecutive variables $V_r,..,V_s$
($r \ge 1$, $s \le n$, $s-r+1 \ge k$) such that:

- Propagation on the conjunction of constraints
¬CTR($V_r,..,V_{r+k-1}$),..,¬CTR($V_{s-k+1},..,V_s$)
does not find a contradiction*,

- $s$ is equal to $n$ or the propagation on the conjunction of constraints
¬CTR($V_r,..,V_{r+k-1}$),..,¬CTR($V_{s-k+2},..,V_{s+1}$)
finds a contradiction.

The greedy algorithm constructs a series of maximal sequences of consecutive
variables. It returns a valid lower bound since stopping a maximal sequence earlier will
not allow expanding the next maximal sequence further on to the right. The lower
bound may not be sharp since:

- It depends on whether the propagation algorithm associated to constraint ¬CTR is
complete⁹ or not.

- It depends on whether we have a global propagation algorithm that can take into
account the fact that consecutive constraints partially overlap (i.e. have $k-1$
variables in common). For example, consider ¬CTR being the constraint
$X_i + X_{i+1} + X_{i+2} + X_{i+3} = 2$. Furthermore assume we have four 0-1 domain variables
$V_1,V_2,V_3,V_4$. Suppose now that we apply ¬CTR on each sliding sequence of four
consecutive variables of the series $0,V_1,V_2,V_3,V_4$ (e.g. we have the two constraints
$0 + V_1 + V_2 + V_3 = 2$ and $V_1 + V_2 + V_3 + V_4 = 2$). If each of the previous constraints is
propagated in an independent way we will miss the fact that $V_4$ is equal to 0.

- If ¬CTR is a constraint that involves more than 2 variables (i.e. $k > 2$) then the
fact that we backtrack after a failure restores the domain of the variables to their
initial state; however, since two consecutive maximal sequences have $k-2$
variables in common, there is an interaction that is ignored by our algorithm.

We illustrate the previous algorithm with an example of the cyclic_change constraint
that was introduced in [1, page 44] and described in row 2 of Table 1. The
cyclic_change constraint is a member of the cardinality-path family constraint where
CTR is the following binary constraint: $(X_i + 1) \bmod Z \ne X_{i+1}$. $Z$ is a strictly positive
integer. Constraint cyclic_change(2,5,{2,3,4,0,2,3,1}) holds since $(X_i + 1) \bmod 5 \ne X_{i+1}$ is
verified exactly 2 times, namely $(0 + 1) \bmod 5 \ne 2$ and $(3 + 1) \bmod 5 \ne 1$.

Let’s assume we have the constraint
cyclic_change($C$, 5, {$V_1,V_2,V_3,V_4,V_5,V_6,V_7,V_8,V_9$}) with the following initial domains:
$V_1$:{0,3}, $V_2$:{2,3,4}, $V_3$:{0,4}, $V_4$:{0,1,2,3,4}, $V_5$:{0,1,2,3}, $V_6$:{0,2,4}, $V_7$:{0,1,2},
$V_8$:{0,1,2}, $V_9$:{0,4}. Table 2 gives the 2 maximal sequences $V_1,V_2,V_3,V_4,V_5$ and
$V_6,V_7,V_8$ built by the algorithm in order to evaluate the minimum number of
constraints that hold, namely 2 in this case. For each maximal sequence, a line in the
table represents the constraint that is currently added and the state of the domains
after posting that constraint.

* This does not mean that there is a solution for this conjunction of constraints since the
propagation may be incomplete.

⁹ A propagation algorithm for constraint CTR($V_1,..,V_n$) is called complete if after propagation
$\forall i \in \{1,2,..,n\}$, $\forall v \in$ dom($V_i$), there exists at least one feasible solution for
CTR($V_1,..,V_n$) with $V_i = v$.



Table 2. Maximal sequences of consecutive variables built by the algorithm

First maximal sequence

Constraint added           Domains after posting the constraint
(initial state)            V1:{0,3}
(V1 + 1) mod 5 = V2        V1:{3} V2:{4}
(V2 + 1) mod 5 = V3        V1:{3} V2:{4} V3:{0}
(V3 + 1) mod 5 = V4        V1:{3} V2:{4} V3:{0} V4:{1}
(V4 + 1) mod 5 = V5        V1:{3} V2:{4} V3:{0} V4:{1} V5:{2}
(V5 + 1) mod 5 = V6        contradiction

Second maximal sequence

Constraint added           Domains after posting the constraint
(initial state)            V6:{0,2,4}
(V6 + 1) mod 5 = V7        V6:{0,4} V7:{0,1}
(V7 + 1) mod 5 = V8        V6:{0,4} V7:{0,1} V8:{1,2}
(V8 + 1) mod 5 = V9        contradiction
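The greedy construction of maximal sequences can be replayed on this example with a small prototype. This is a sketch under simplifying assumptions: domains are explicit Python sets, propagation is a brute-force fixpoint over the windows posted so far, and backtracking is emulated by restarting from the initial domains. The function names are hypothetical.

```python
from itertools import product


def propagate(domains, windows, neg_ctr):
    """Enforce neg_ctr on every posted window until a fixpoint is reached.

    `windows` is a list of (start, k) pairs; returns the filtered domains,
    or None if some window has no supporting tuple.
    """
    doms = [set(d) for d in domains]
    changed = True
    while changed:
        changed = False
        for start, k in windows:
            scope = range(start, start + k)
            supports = [
                t for t in product(*(sorted(doms[j]) for j in scope))
                if neg_ctr(*t)
            ]
            if not supports:
                return None
            for position, j in enumerate(scope):
                kept = {t[position] for t in supports}
                if kept != doms[j]:
                    doms[j] = kept
                    changed = True
    return doms


def greedy_min_holding(domains, k, neg_ctr):
    """Lower bound on the number of elementary constraints that hold."""
    min_break = 0
    posted = []  # windows of the current maximal sequence
    for i in range(len(domains) - k + 1):
        if propagate(domains, posted + [(i, k)], neg_ctr) is None:
            min_break += 1
            posted = []  # backtrack: the next window starts a new sequence
        else:
            posted.append((i, k))
    return min_break


# Initial domains of V1..V9 from the cyclic_change example (Z = 5).
DOMAINS = [{0, 3}, {2, 3, 4}, {0, 4}, {0, 1, 2, 3, 4}, {0, 1, 2, 3},
           {0, 2, 4}, {0, 1, 2}, {0, 1, 2}, {0, 4}]


def no_change(a, b):
    """Negation of the cyclic_change elementary constraint for Z = 5."""
    return (a + 1) % 5 == b
```

On these domains the two failures occur when posting the constraints on (V5, V6) and on (V8, V9), which gives the bound of 2 stated in the text.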



3.3 Computing an Upper Bound of the Maximum Number 
of Elementary Constraints That Hold 

We derive an upper bound (5) of the maximum number of elementary constraints that hold from the following two identities (3) and (4):

$$D = \sum_{i=1}^{n-k+1} \neg CTR(V_i,..,V_{i+k-1}), \qquad (3)$$

$$C + D = n-k+1, \qquad (4)$$

$$\max(C) \le n-k+1-\min(D). \qquad (5)$$

Identity (3) introduces quantity $D$, which is the number of times that the negation of constraint CTR holds on variables $V_1,..,V_n$ (i.e. the number of discontinuities); each term of the sum counts 1 when the negated constraint holds and 0 otherwise. Identity (4) states that the number of elementary constraints that hold plus the number of constraints that do not hold is equal to the total number of constraints $n-k+1$. Finally, Inequality (5) expresses the upper bound of the maximum number of elementary constraints that hold in terms of the lower bound of the minimum number of discontinuities. In order to evaluate $\min(D)$, we use the algorithm described in Sect. 3.2, where we replace enforce_NOT_CTR by enforce_CTR.
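As a short worked step, the bound follows from the identity on C + D by bounding D from below. The numeric value of min(D) used at the end is purely hypothetical and only illustrates how the bound is applied to the running example (n = 9, k = 2).

```latex
% Rearranging C + D = n-k+1 and using D >= min(D):
C = (n-k+1) - D \;\le\; (n-k+1) - \min(D)
\quad\Longrightarrow\quad
\max(C) \le n-k+1-\min(D).
% Running example: n = 9, k = 2, hence n-k+1 = 8 elementary constraints.
% Hypothetical value: if the scan with enforce_CTR returned \min(D) \ge 1,
% the bound would give \max(C) \le 8 - 1 = 7.
```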








3.4 Pruning According to the Minimum and Maximum Number 
of Elementary Constraints That Hold 

We use the following algorithm in order to prune variables according to the maximum value of variable $C$. We remove values that would otherwise cause too many elementary constraints CTR to hold.

 1 i:=1;
 2 FOR inc:=1 TO -1 (STEP -2) DO
 3   exist_choice_point:=1;
 4   create_choice_point;
 5   copy variables V1..Vn to U1..Un;
 6   min_break:=0;
 7   WHILE 1≤i AND i≤n-k+1 DO
 8     IF inc=1 THEN before[i]:=min_break
 9     ELSE after[i]:=min_break ENDIF;
10     IF exist_choice_point=0 THEN
11       exist_choice_point:=1;
12       create_choice_point;
13     ENDIF;
14     IF enforce_NOT_CTR(Ui,..,Ui+k-1) fails THEN
15       backtrack;
16       exist_choice_point:=0;
17       min_break:=min_break+1;
18     ENDIF;
19     IF inc=1 THEN record dom(Ui+k-1) in vbefore[i+k-1]
20     ELSE record dom(Ui) in vafter[i] ENDIF;
21     i:=i+inc;
22   ENDWHILE;
23   IF exist_choice_point THEN backtrack ENDIF;
24   i:=n-k+1;
25 ENDFOR;
26 IF min_break=max(C) THEN
27   FOR i:=1 TO n-k+1 DO
28     IF before[i]+after[i]+1>max(C) THEN
29       enforce_NOT_CTR(Vi,..,Vi+k-1);
30     ENDIF;
31   ENDFOR;
32 ENDIF;
33 IF max(C)-min_break≤1 THEN
34   FOR i:=k TO n-k+1 DO
35     IF before[i-k+2]+after[i-1]+2>max(C) THEN
36       remove values v not in vbefore[i] ∪ vafter[i] from Vi;
37     ENDIF;
38   ENDFOR;
39 ENDIF;

When inc is equal to 1, the first part of the algorithm (lines 1 to 25) computes the minimum number of constraints that hold in the set of constraints $\{CTR(V_1,..,V_k),..,CTR(V_{i-1},..,V_{i+k-2})\}$ for each $i$ between 1 and $n-k+1$. This number is recorded in before[i]. We also initialize the sets of values vbefore[i] ($k \le i \le n$) to the values that are still in the domain of variable $V_i$ just after propagating constraint $\neg CTR(V_{i-k+1},..,V_i)$. This is the first constraint that mentions variable $V_i$ when we scan the constraints from left to right.

* When $i$ is equal to 1, the previous set is empty and before[i] is equal to 0.

When inc is equal to -1, the first part of the algorithm (lines 1 to 25) computes the minimum number of constraints that hold in the set of constraints $\{CTR(V_{i+1},..,V_{i+k}),..,CTR(V_{n-k+1},..,V_n)\}$ for each $i$ between 1 and $n-k+1$. This number is stored in after[i]. We also initialize the sets of values vafter[i] ($n-k+1 \ge i \ge 1$) to the values that are still in the domain of variable $V_i$ just after propagating constraint $\neg CTR(V_i,..,V_{i+k-1})$. This is the first constraint that mentions variable $V_i$ when we scan the constraints from right to left.

The min_break counter (line 17) is processed twice by the +1 and -1 loops (line 
2). Because of the hypothesis made in the introduction “failure detection should be 
independent from the order in which constraints CTR are posted” we get twice the 
same value for the min_break counter. 

The second part of the algorithm (lines 26 to 32) enforces constraint $\neg CTR(V_i,..,V_{i+k-1})$ to hold if the minimum number of constraints that hold in the set $\{CTR(V_1,..,V_k),..,CTR(V_{i-1},..,V_{i+k-2})\}$ plus the minimum number of constraints that hold in the set $\{CTR(V_{i+1},..,V_{i+k}),..,CTR(V_{n-k+1},..,V_n)\}$ is just equal to the maximum possible number of elementary constraints that hold.

The third part of the algorithm (lines 33 to 39) removes from a variable the values 
that cause two distinct additional elementary constraints to hold. We now prove that 
the third part of the algorithm removes only values that would lead to a failure of the 
cardinality-path constraint. 

CASE 1: Assume that posting constraint $\neg CTR(V_{i-k+1},..,V_i)$ or constraint $\neg CTR(V_i,..,V_{i+k-1})$ did generate a failure (line 14). Since we backtrack (line 15), vbefore[i] (line 19) or vafter[i] (line 20) would contain all values of variable $V_i$. We derive from this fact that no value will be removed from variable $V_i$ (line 36).

CASE 2: Let us assume that posting constraint $\neg CTR(V_{i-k+1},..,V_i)$ and constraint $\neg CTR(V_i,..,V_{i+k-1})$ did not generate any failure (line 14). In this case, we show that if $V_i \notin$ vbefore[i] $\cup$ vafter[i] then the minimum number of constraints of $\{CTR(V_1,..,V_k),..,CTR(V_{n-k+1},..,V_n)\}$ that hold is greater than or equal to before[i-k+2]+after[i-1]+2.

* If $\neg CTR(V_{i-k+1},..,V_i)$ finds a contradiction then vbefore[i] is initialized to the initial domain of variable $V_i$.

* When $i$ is equal to $n-k+1$, the previous set of constraints is empty and after[i] is equal to 0.

* If $\neg CTR(V_i,..,V_{i+k-1})$ finds a contradiction then vafter[i] is initialized to the initial domain of variable $V_i$.

Let us note:

- $\min_{CTR}(a,b)$ the minimum number of constraints that hold in the conjunction of constraints $CTR(V_a,..,V_{a+k-1}),..,CTR(V_b,..,V_{b+k-1})$, where $1 \le a \le b \le n-k+1$.

- $f$ the smallest value less than or equal to $i-k+1$ such that the following two conditions are true:

  • No failure was detected during the first iteration of the algorithm (when inc=1) on the conjunction of constraints $\neg CTR(V_f,..,V_{f+k-1}),..,\neg CTR(V_{i-k+1},..,V_i)$.

  • $f = 1$ or a failure was detected after stating $\neg CTR(V_{f-1},..,V_{f+k-2})$.

- $l$ the largest value greater than or equal to $i$ such that the following two conditions are true:

  • No failure was detected during the second iteration of the algorithm (when inc=-1) on the conjunction of constraints $\neg CTR(V_i,..,V_{i+k-1}),..,\neg CTR(V_l,..,V_{l+k-1})$.

  • $l = n-k+1$ or a failure was detected after stating $\neg CTR(V_{l+1},..,V_{l+k})$.

We have:

Partitioning the set of constraints $\{CTR(V_1,..,V_k),..,CTR(V_{i-k+1},..,V_i)\}$ in the set $\{CTR(V_1,..,V_k),..,CTR(V_{f-1},..,V_{f+k-2})\}$ and in $\{CTR(V_f,..,V_{f+k-1}),..,CTR(V_{i-k+1},..,V_i)\}$ leads to $\min_{CTR}(1,i-k+1) \ge \min_{CTR}(1,f-1) + \min_{CTR}(f,i-k+1)$.

From the definition of $f$ we have that: $\min_{CTR}(1,f-1) \ge$ before[f] = before[i-k+2].

Since $V_i \notin$ vbefore[i] we also have that: $\min_{CTR}(f,i-k+1) \ge 1$.

So we conclude that: $\min_{CTR}(1,i-k+1) \ge$ before[i-k+2] + 1.

In a similar way, we have:

Partitioning the set of constraints $\{CTR(V_i,..,V_{i+k-1}),..,CTR(V_{n-k+1},..,V_n)\}$ in the set $\{CTR(V_i,..,V_{i+k-1}),..,CTR(V_l,..,V_{l+k-1})\}$ and in $\{CTR(V_{l+1},..,V_{l+k}),..,CTR(V_{n-k+1},..,V_n)\}$ leads to $\min_{CTR}(i,n-k+1) \ge \min_{CTR}(i,l) + \min_{CTR}(l+1,n-k+1)$.

Since $V_i \notin$ vafter[i] we also have that: $\min_{CTR}(i,l) \ge 1$.

From the definition of $l$ we have that: $\min_{CTR}(l+1,n-k+1) \ge$ after[l] = after[i-1].

So we conclude that: $\min_{CTR}(i,n-k+1) \ge 1 +$ after[i-1].

So the minimum number of constraints that hold in $\{CTR(V_1,..,V_k),..,CTR(V_{n-k+1},..,V_n)\}$ is greater than or equal to before[i-k+2] + after[i-1] + 2. □

Table 3 gives an example of execution of the previous algorithm on the example introduced at the end of Sect. 3.

Table 3. Tables before[], after[], vbefore[], vafter[] built by the algorithm

  variables   V1    V2     V3    V4         V5       V6     V7     V8     V9
  domains     0,3   2,3,4  0,4   0,1,2,3,4  0,1,2,3  0,2,4  0,1,2  0,1,2  0,4
  before[]    0     0      0     0          0        1      1      1      -
  vbefore[]   -     4      0     1          2        0,2,4  0,1    1,2    0,4
  after[]     2     2      1     1          1        1      1      0      -
  vafter[]    3     3,4    0,4   2          3        0,4    0,1    0,1,2  -
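The two scans that fill Table 3 can be reproduced with a small Python sketch. It is an illustration under stated assumptions rather than the authors' code: it is specialized to k = 2 and to the negated cyclic_change constraint (V_j + 1) mod m = V_{j+1}, uses 0-based indices, models "backtrack" by restoring the initial domains, and the helper names `revise`, `post` and `scan` are hypothetical.

```python
def revise(doms, j, m):
    """Filter (V_j + 1) mod m == V_{j+1} (0-based j); return True if a domain changed."""
    left = {v for v in doms[j] if (v + 1) % m in doms[j + 1]}
    right = {(v + 1) % m for v in left}
    changed = left != doms[j] or right != doms[j + 1]
    doms[j], doms[j + 1] = left, right
    return changed


def post(doms, posted, m):
    """Propagate all posted negated constraints to a fixpoint; False on an empty domain."""
    while any([revise(doms, j, m) for j in posted]):
        pass
    return all(doms)


def scan(domains, m, order, recorded_var):
    """One pass of the WHILE loop for k = 2 over constraint indices in `order`."""
    count, values = {}, {}
    doms, posted, min_break = [set(d) for d in domains], [], 0
    for j in order:
        count[j] = min_break                       # before[] or after[]
        posted.append(j)
        if not post(doms, posted, m):
            doms, posted = [set(d) for d in domains], []   # backtrack
            min_break += 1
        var = recorded_var(j)
        values[var] = set(doms[var])               # vbefore[] or vafter[]
    return count, values, min_break


domains = [{0, 3}, {2, 3, 4}, {0, 4}, {0, 1, 2, 3, 4}, {0, 1, 2, 3},
           {0, 2, 4}, {0, 1, 2}, {0, 1, 2}, {0, 4}]
n_ctr = len(domains) - 1
before, vbefore, breaks_fwd = scan(domains, 5, range(n_ctr), lambda j: j + 1)
after, vafter, breaks_bwd = scan(domains, 5, range(n_ctr - 1, -1, -1), lambda j: j)
```

With 0-based keys, `before[j]` corresponds to before[j+1] of Table 3, and `vbefore[v]` to vbefore[v+1]. Both scans end with the same counter value, as the text states for min_break.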





If at most two constraints should hold then part 2 (lines 26-32) will perform the following pruning:

- since before[1]+after[1]+1=3>2, line 29 imposes constraint $(V_1+1) \bmod 5 = V_2$, which fixes $V_1$ to 3 and $V_2$ to 4,

- since before[2]+after[2]+1=3>2, line 29 imposes constraint $(V_2+1) \bmod 5 = V_3$, which fixes $V_3$ to 0,

- since before[6]+after[6]+1=3>2, line 29 imposes constraint $(V_6+1) \bmod 5 = V_7$, which removes value 2 from variables $V_6$ and $V_7$,

- since before[7]+after[7]+1=3>2, line 29 imposes constraint $(V_7+1) \bmod 5 = V_8$, which removes value 0 from variable $V_8$.



If at most two constraints should hold then part 3 (lines 33-39) will perform the following pruning:

- since before[2]+after[1]+2=4>2, line 36 removes values in dom($V_2$) − (vbefore[2] ∪ vafter[2]) = {2,3,4} − {3,4} = {2} from variable $V_2$,

- since before[3]+after[2]+2=4>2, line 36 removes values in dom($V_3$) − (vbefore[3] ∪ vafter[3]) = {0,4} − {0,4} = ∅ from variable $V_3$,

- since before[4]+after[3]+2=3>2, line 36 removes values in dom($V_4$) − (vbefore[4] ∪ vafter[4]) = {0,1,2,3,4} − {1,2} = {0,3,4} from variable $V_4$,

- since before[5]+after[4]+2=3>2, line 36 removes values in dom($V_5$) − (vbefore[5] ∪ vafter[5]) = {0,1,2,3} − {2,3} = {0,1} from variable $V_5$,

- since before[6]+after[5]+2=4>2, line 36 removes values in dom($V_6$) − (vbefore[6] ∪ vafter[6]) = {0,2,4} − {0,2,4} = ∅ from variable $V_6$,

- since before[7]+after[6]+2=4>2, line 36 removes values in dom($V_7$) − (vbefore[7] ∪ vafter[7]) = {0,1,2} − {0,1} = {2} from variable $V_7$,

- since before[8]+after[7]+2=4>2, line 36 removes values in dom($V_8$) − (vbefore[8] ∪ vafter[8]) = {0,1,2} − {0,1,2} = ∅ from variable $V_8$.

Finally, if at most three constraints should hold then part 3 (lines 33-39) will perform the following pruning:

- since before[2]+after[1]+2=4>3, line 36 removes values in dom($V_2$) − (vbefore[2] ∪ vafter[2]) = {2,3,4} − {3,4} = {2} from variable $V_2$,

- since before[3]+after[2]+2=4>3, line 36 removes values in dom($V_3$) − (vbefore[3] ∪ vafter[3]) = {0,4} − {0,4} = ∅ from variable $V_3$,

- since before[6]+after[5]+2=4>3, line 36 removes values in dom($V_6$) − (vbefore[6] ∪ vafter[6]) = {0,2,4} − {0,2,4} = ∅ from variable $V_6$,

- since before[7]+after[6]+2=4>3, line 36 removes values in dom($V_7$) − (vbefore[7] ∪ vafter[7]) = {0,1,2} − {0,1} = {2} from variable $V_7$,

- since before[8]+after[7]+2=4>3, line 36 removes values in dom($V_8$) − (vbefore[8] ∪ vafter[8]) = {0,1,2} − {0,1,2} = ∅ from variable $V_8$.
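The pruning decisions listed above can be checked mechanically from the entries of Table 3. The sketch below is only an illustration: it hard-codes the table with the paper's 1-based indices for n = 9 and k = 2, takes min_break = 2 from the scans, and the function names `part2` and `part3` are hypothetical labels for lines 26-32 and 33-39 of the algorithm.

```python
# Table 3 with 1-based keys (n = 9 variables, k = 2).
dom = {1: {0, 3}, 2: {2, 3, 4}, 3: {0, 4}, 4: {0, 1, 2, 3, 4}, 5: {0, 1, 2, 3},
       6: {0, 2, 4}, 7: {0, 1, 2}, 8: {0, 1, 2}, 9: {0, 4}}
before = dict(zip(range(1, 9), [0, 0, 0, 0, 0, 1, 1, 1]))
after = dict(zip(range(1, 9), [2, 2, 1, 1, 1, 1, 1, 0]))
vbefore = {2: {4}, 3: {0}, 4: {1}, 5: {2}, 6: {0, 2, 4}, 7: {0, 1}, 8: {1, 2}, 9: {0, 4}}
vafter = {1: {3}, 2: {3, 4}, 3: {0, 4}, 4: {2}, 5: {3}, 6: {0, 4}, 7: {0, 1}, 8: {0, 1, 2}}
n, k, min_break = 9, 2, 2


def part2(max_c):
    """Indices i for which line 29 enforces the negated constraint number i."""
    if min_break != max_c:
        return []
    return [i for i in range(1, n - k + 2) if before[i] + after[i] + 1 > max_c]


def part3(max_c):
    """Values that line 36 removes from each V_i (possibly the empty set)."""
    removed = {}
    if max_c - min_break <= 1:
        for i in range(k, n - k + 2):
            if before[i - k + 2] + after[i - 1] + 2 > max_c:
                removed[i] = dom[i] - (vbefore[i] | vafter[i])
    return removed
```

Calling `part2(2)`, `part3(2)` and `part3(3)` gives the three lists of the worked example; `part2(3)` is empty because min_break differs from max(C) in that case.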

A similar pruning for variables is done according to the minimum value of variable $C$. For this purpose we use the same algorithm, where we replace enforce_NOT_CTR by enforce_CTR and max(C) by n-k+1-min(C). The previous quantity is the maximum number of constraints of the form $\neg CTR$ that hold.



3.5 Integrating Partially the “External World” 

The purpose of this paragraph is to show how to partially integrate external con- 
straints within some of the propagation algorithms of the cardinality-path constraint 
family. This is not a very common approach, since usually most of the constraint 
algorithms are local to a given constraint. However, this is especially relevant for 
getting stronger propagation. 

The algorithm of Sect. 3.2, which computes a lower bound of the minimum number of elementary constraints that hold, can be modified as follows. We no longer duplicate (line 3) the variables $V_1,..,V_n$, but work directly on them. This wakes not only the constraints we state inside the algorithm but also the external constraints mentioning a variable whose domain is reduced. As a result, the maximum sequences may be shorter than those obtained by the original algorithm. If this is the case, we get an improved lower bound of the minimum number of elementary constraints that hold.



4 Redundant Constraints for Cardinality-Path 

This section shows how to generate redundant cardinality constraints for the cardinal- 
ity-path constraint family. The goal is then to use the algorithms presented in Sect. 2 
in order to perform additional propagation from these redundant constraints. The idea 
is as follows: 

- first consider all the constraints $CTR(V_{i-k+1},..,V_i),..,CTR(V_i,..,V_{i+k-1})$ mentioning a given variable $V_i$ ($k \le i \le n-k+1$) of the cardinality-path constraint family,

- secondly, use the information provided by before[] and after[] in order to evaluate the minimum number of constraints CTR that hold both in $CTR(V_1,..,V_k),..,CTR(V_{i-k},..,V_{i-1})$ and in $CTR(V_{i+1},..,V_{i+k}),..,CTR(V_{n-k+1},..,V_n)$; this is equal to before[i-k+1]+after[i]. Since at most max(C) constraints hold in the whole set of constraints $CTR(V_1,..,V_k),..,CTR(V_{n-k+1},..,V_n)$ it follows that at most max(C)-before[i-k+1]-after[i] constraints hold in $CTR(V_{i-k+1},..,V_i),..,CTR(V_i,..,V_{i+k-1})$. Given that the previous set contains $k$ constraints we have that at least k-max(C)+before[i-k+1]+after[i] constraints from $CTR(V_{i-k+1},..,V_i),..,CTR(V_i,..,V_{i+k-1})$ should fail.

- finally, since $V_i$ is a variable which occurs in all constraints $CTR(V_{i-k+1},..,V_i),..,CTR(V_i,..,V_{i+k-1})$ and given that at least k-max(C)+before[i-k+1]+after[i] such constraints should fail we adapt the algorithm of Sect. 2 in the following way in order to try to prune variable $V_i$.

min_c:=k-max(C)+before[i-k+1]+after[i];
FOR j:=i-k+1 TO i DO
  create_choice_point;
  IF enforce_NOT_CTR(Vj,..,Vj+k-1)≠fail THEN
    FOR all val∈dom(Vi) such that count_val[val]<min_c DO
      count_val[val]:=count_val[val]+1;
  backtrack;
FOR all val∈dom(Vi) such that count_val[val]<min_c DO
  remove value val from Vi;
RETURN delay;

As an illustrative example consider the constraint relaxed_sliding_sum$(2,3,0,9,3,\{V_1,V_2,V_3,V_4,V_5\})$ where the initial domains of variables $V_1,V_2,V_3,V_4,V_5$ respectively are 1..9, 4..9, 1..9, 3..9 and 1..9. It imposes that at least 2 and at most 3 constraints of the form $0 \le V_i + V_{i+1} + V_{i+2} \le 9$ ($1 \le i \le 3$) hold. The standard pruning introduced in Sect. 3 reduces the maximum value of $V_3$ to 5, but the previous propagation technique further restricts the maximum value of $V_3$ to 4. This is because two constraints enforce the maximum value of $V_3$ to be less than or equal to value 4: $0 \le V_1 + V_2 + V_3 \le 9$ imposes $\max(V_3) \le 4$, while $0 \le V_2 + V_3 + V_4 \le 9$ enforces $\max(V_3) \le 2$.
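The bound on the third variable can be checked by brute force. The sketch below is an assumption-laden illustration, not the adapted algorithm itself: instead of posting negated constraints and comparing counters with min_c, it directly counts, for each candidate value, how many of the three sliding-sum constraints can still be satisfied, and keeps the values supported by at least two of them. The names `can_hold` and `prune` are hypothetical.

```python
from itertools import product

# Domains of V1..V5 (0-based positions) and the three windows of size k = 3.
doms = [range(1, 10), range(4, 10), range(1, 10), range(3, 10), range(1, 10)]
windows = [(0, 1, 2), (1, 2, 3), (2, 3, 4)]


def can_hold(window, pos, val, lo=0, hi=9):
    """True if lo <= sum(window) <= hi is satisfiable once variable `pos` equals `val`."""
    others = [doms[j] for j in window if j != pos]
    return any(lo <= val + sum(rest) <= hi for rest in product(*others))


def prune(pos, min_hold):
    """Values of variable `pos` supported by at least `min_hold` windows containing it."""
    return [v for v in doms[pos]
            if sum(can_hold(w, pos, v) for w in windows) >= min_hold]


kept = prune(2, 2)   # third variable, at least 2 constraints must hold
```

The list `kept` ends at 4: values 3 and 4 are supported by the first and third constraints only, while value 5 is supported by the third constraint alone.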



5 Conclusion and Open Questions 

As a first contribution we have introduced for the cardinality operator new propaga- 
tion rules which are not only based on entailment, as has been the case for the last 10 
years. In fact, these rules can be regarded as a generalization of constructive disjunc- 
tion [4]. As a second contribution, we have presented generic propagation algorithms 
for the cardinality-path constraint family. We have also shown how to extend these 
propagation algorithms in order to consider the influence of external constraints that 
share some variable with the cardinality-path constraint. As one can observe, one of 
the main advantages of generic propagation algorithms is that they can be applied to 
all constraints having some internal structure [1] in common. One should notice that 
the algorithms presented in this paper are still valid when the elementary constraints 
CTR do not all correspond to the same constraint. However, from the cardinality-path constraint family perspective this is somewhat of a weakness, since it means that we did not take advantage of the fact that we have the same constraint CTR.



* We assume count_val[val] to be initialized to 0.

* This constraint was introduced in the last entry of Table 1 of Sect. 3.1.

The following example shows that, even when the arity of the elementary constraint is equal to 2, our algorithm does not always find out that no solution exists. If we consider constraint cardinality_path$(1, \{0,V_1,V_2,V_3,0\}, \neq)$ such that $V_1,V_2,V_3$ are 0-1 domain variables, then there is no solution since the number of satisfied disequality constraints is always even, whereas exactly one is required. However the current algorithm seems to make a complete pruning in the case of binary constraints such as the less or the greater constraints.
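The parity argument can be verified exhaustively, since only eight assignments exist. The following sketch enumerates them; the helper name `changes` is hypothetical.

```python
from itertools import product


def changes(seq):
    """Number of consecutive pairs of `seq` that satisfy the disequality constraint."""
    return sum(a != b for a, b in zip(seq, seq[1:]))


# All assignments of the three 0-1 variables placed between the two fixed zeros.
counts = {changes((0, v1, v2, v3, 0)) for v1, v2, v3 in product((0, 1), repeat=3)}
```

Every reachable count is even, so a requirement of exactly one satisfied disequality cannot be met.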

From the previous remarks one can ask the following questions: 

- For which class of binary constraints does our algorithm lead to complete pruning? 

- For which class of binary constraints is there no need to perform saturation in order 
to get a complete pruning? In this case, propagation concerns only the elementary 
constraint that is currently posted and not the elementary constraints that were al- 
ready posted. 

- How to extend our algorithm in order to get more propagation for some classes of 
elementary constraints? 

- Is it valid to apply the modification indicated in Sect. 3.5 to the pruning algorithm 
given in Sect. 3.4? 



Acknowledgements 

Thanks to Per Mildner and Emmanuel Poder for useful observations on an earlier 
draft of this paper as well as to anonymous reviewers for their comments. 



References 



1. Beldiceanu, N.: Global Constraints as Graph Properties on Structured Network of Elemen- 
tary Constraints of the Same Type. SICS Technical Report T2000/01, (2000). 

2. Dincbas, M., Simonis, H., Van Hentenryck, P.: Solving the Car-Sequencing Problem in 
Constraint Logic Programming. ECAI 1988, 290-295, (1988). 

3. Van Hentenryck, P., Deville, Y.: The Cardinality Operator: A New Logical Connective for 
Constraint Logic Programming. ICLP 1991, 745-759, (1991). 

4. Van Hentenryck, P., Saraswat, V., Deville, Y.: Design, Implementation and Evaluation of the Constraint Language cc(FD). In A. Podelski, ed., Constraints: Basics and Trends, vol. 910 of Lecture Notes in Computer Science, Springer-Verlag, (1995).

5. Würtz, J., Müller, T.: Constructive Disjunction Revisited. In 20th German Annual Conf. on AI, LNAI 1137, 377-386, eds. G. Görz and S. Hölldobler, Springer-Verlag, (1996).




Optimizing Compilation 
of Constraint Handling Rules 



Christian Holzbaur¹, Maria Garcia de la Banda², David Jeffery², and Peter J. Stuckey³

¹ Dept. of Medical Cybernetics and Art. Intelligence, University of Vienna, Austria
christian@ai.univie.ac.at
² School of Comp. Sci. & Soft. Eng., Monash University, Australia
{mbanda,dgj}@csse.monash.edu.au
³ Dept. of Comp. Sci. & Soft. Eng., University of Melbourne, Australia
pjs@cs.mu.oz.au



Abstract. CHRs are a multi-headed committed choice constraint lan- 
guage, commonly applied for writing incremental constraint solvers. CHRs 
are usually implemented as a language extension that compiles to the un- 
derlying language. In this paper we discuss the optimizing compilation of 
Constraint Handling Rules (CHRs). In particular, we show how we can
use different kinds of information in the compilation of CHRs in order 
to obtain access efficiency, and a better translation of the CHR rules 
into the underlying language. The kinds of information used include the 
types, modes, determinism, functional dependencies and symmetries of 
the CHR constraints. We also show how to analyze CHR programs to 
determine information about functional dependencies, symmetries and 
other kinds of information supporting optimizations. 



1 Introduction 

Constraint handling rules [3] (CHRs) are a very flexible formalism for writing 
incremental constraint solvers and other reactive systems. In effect, the rules 
define transitions from one constraint set to an equivalent constraint set. Transitions serve to simplify constraints and detect satisfiability and unsatisfiability. CHRs have been used extensively (see e.g. [4]). Efficient implementations are already available for the languages SICStus Prolog and Eclipse Prolog, and will soon appear for others such as Java [5] and HAL [2].

In this paper we discuss how to improve the compilation of CHRs by using 
additional information derived either from declarations provided by the user 
or from the analysis of the constraint handling rules themselves. The major 
improvements we discuss over previous papers [4] on CHR compilation are: 

— general index structures which are specialized for the particular joins re- 
quired in the CHR execution. Previous CHR compilation was restricted to 
two kinds of indexes: simple lists of constraints for given Name/Arity and 
lists indexed by the variables involved. For ground usage of CHRs this meant 
that only list indexes were used.
— continuation optimization, where we use matching information from rules
earlier in the execution to avoid matching later rules.

— optimizations that take into account algebraic properties such as functional 
dependencies, symmetries and the set semantics of the constraints. 

We illustrate the advantages of the various optimizations experimentally on a 
number of small example programs in the HAL implementation of CHRs. We 
also discuss how the extra information required by HAL in defining CHRs (that 
is, type, mode and determinism information) is used to improve the execution. 

In part, some of the motivation of this work revolves around a difference 
between CHRs in Prolog and in HAL. HAL is a typed language which does not 
(presently) support attributed variables. Prolog implementations of CHRs rely 
on the use of attributed variables to provide efficient indexing into the constraint 
store. Hence, we are critically interested in determining efficient index structures 
for storing constraints in the HAL implementation of CHRs. An important ben- 
efit of using specific index structures is that CHRs which are completely ground 
can still be efficiently indexed. This is not exploited in the current Prolog im- 
plementations. As many CHR solvers only use ground constraints this is an 
important issue. 

2 Constraint Handling Rules and HAL 

Constraint Handling Rules manipulate a global multiset of primitive constraints, using multiset rewrite rules which can take three forms

simplification  [name@] $c_1, \ldots, c_n \Leftrightarrow g \mid d_1, \ldots, d_m$

propagation     [name@] $c_1, \ldots, c_n \Rightarrow g \mid d_1, \ldots, d_m$

simpagation     [name@] $c_1, \ldots, c_l \setminus c_{l+1}, \ldots, c_n \Leftrightarrow g \mid d_1, \ldots, d_m$

where name is an optional rule name, $c_1, \ldots, c_n$ are CHR constraints, $g$ is a conjunction of constraints from the underlying language, and $d_1, \ldots, d_m$ is a conjunction of CHR constraints and constraints of the underlying language. The guard part $g$ is optional. If omitted, it is equivalent to $g \equiv true$.

The simplification rule states that given a constraint multiset $\{c'_1, \ldots, c'_n\}$ and substitution $\theta$ matching the multiset $\{c_1, \ldots, c_n\}$, i.e. $\{c'_1, \ldots, c'_n\} = \theta(\{c_1, \ldots, c_n\})$, where the execution of $\theta(g)$ succeeds, then we can replace $\{c'_1, \ldots, c'_n\}$ by multiset $\theta(\{d_1, \ldots, d_m\})$. The propagation rule states that, for a matching constraint multiset $\{c'_1, \ldots, c'_n\}$ where $\theta(g)$ succeeds, we should add $\theta(\{d_1, \ldots, d_m\})$. The simpagation rule states that, given a matching constraint multiset $\{c'_1, \ldots, c'_n\}$ where $\theta(g)$ succeeds, we can replace $\{c'_{l+1}, \ldots, c'_n\}$ by $\theta(\{d_1, \ldots, d_m\})$. A CHR program is a sequence of CHRs.

The operational semantics of CHRs exhaustively apply rules to the global 
multiset of constraints, being careful not to apply propagation rules twice on 
the same constraints (to avoid infinite propagation). For more details see e.g. [1]. 
Although CHRs have a logical reading (see e.g. [3]) and programmers are encouraged to write confluent CHR programs, there are applications where a predictable order of rule applications is important. Hence, their textual order is used to resolve rule applicability conflicts in favor of earlier rules.

In this paper we focus on the implementation of CHRs in a programming 
language, such as HAL [2], which requires programmers to provide type, mode 
and determinism information. A simple example of a HAL CHR program to 
compute the greatest common divisor of two positive numbers a and b (using
the goal gcd(a), gcd(b)) is given below.



:- module gcd.                                     (L1)
:- import int.                                     (L2)
:- export constraint gcd(int).                     (L3)
:- mode gcd(in) is det.                            (L4)
base @ gcd(0) <=> true.                            (L5)
pair @ gcd(N) \ gcd(M) <=> M >= N | gcd(M-N).      (L6)



The first line (L1) states that the file defines the module gcd. Line (L2) imports the standard library module int which provides (ground) arithmetic and comparison predicates for the type int. Line (L3) exports the CHR constraint gcd/1 which has one argument, an int. This is the type declaration for gcd/1. Line (L4) is an example of a mode of usage declaration. The CHR constraint gcd/1's first argument has mode in, meaning that it will be fixed (ground) when called. The second part of the declaration, “is det”, is a determinism statement. It indicates that gcd/1 always succeeds exactly once (for each separate call). For more details on types, modes and determinism see [2,6].

Lines (L5) and (L6) are the two CHRs defining the gcd/1 constraint. The first rule is a simplification rule. It states that a constraint of the form gcd(0) should be removed from the constraint store to ensure termination. The second rule is a simpagation rule. It states that given two different gcd/1 constraints in the store, such that one gcd(M) has an argument at least as large as that of the other gcd(N), we should remove the larger (the one after the \), and add a new gcd/1 constraint with argument M-N. Together these rules mimic Euclid's algorithm.
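The effect of the two rules on the constraint store can be imitated with a few lines of Python. This is a minimal sketch, assuming the store is a plain list of integers and ignoring the active-constraint execution order of a real CHR system; the function name `run_gcd` is hypothetical.

```python
def run_gcd(goals):
    """Exhaustively apply the base and pair rules to a multiset of gcd/1 arguments."""
    store = list(goals)
    while True:
        if 0 in store:
            # base @ gcd(0) <=> true.
            store.remove(0)
            continue
        # Look for two distinct constraints gcd(N) (kept) and gcd(M) (removed), M >= N.
        pair = next(((a, b) for a in range(len(store)) for b in range(len(store))
                     if a != b and store[b] >= store[a]), None)
        if pair is None:
            return store          # no rule applies any more
        keep, drop = pair
        # pair @ gcd(N) \ gcd(M) <=> M >= N | gcd(M-N).
        store[drop] -= store[keep]


result = run_gcd([9, 6])
```

For the goal gcd(9), gcd(6) the store evolves as [9,6], [3,6], [3,3], [3,0], [3], leaving the single constraint gcd(3).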

The requirement of the HAL compiler to always have correct mode and determinism information means that CHR constraints can only have declared modes that do not change the instantiation state of their arguments,¹ since the compiler will be unable to statically determine when rules fire. The same restriction applies to dynamically scheduled goals in HAL (see [2]).²

3 Optimizing the Basic Compilation of CHRs 

Essentially, the execution of CHRs is as follows. Every time a new constraint (the active constraint) is placed in the store, we search for a rule that can fire given this new constraint, i.e., a rule for which there is now a set of constraints that matches its left hand side. The first such rule (in the textual order they appear in the program) is fired.

¹ They may actually change the instantiation state but this cannot be made visible to the mode system.

² Unlike dynamically scheduled goals in HAL, CHR constraints can have multi or nondet determinism.

Given this scheme, the bulk of the execution time for a CHR

$c_1, \ldots, c_l \; [, \setminus] \; c_{l+1}, \ldots, c_n \Leftrightarrow g \mid d_1, \ldots, d_m$

is spent in determining partner constraints $c'_1, \ldots, c'_{i-1}, c'_{i+1}, \ldots, c'_n$ for an active constraint $c'_i$ to match the left hand side of the CHR. Hence, for each rule and each occurrence of a constraint, we are interested in generating efficient code for searching for partners that will cause the rule to fire. We will then link this code together to form the entire program for the constraint. A more detailed description of the overall process is given in [4], which is the basis for the SICStus Prolog version of CHRs. In the rest of this section, when applicable, we will show how different kinds of compile-time information can be used to improve the resulting code in the HAL version of CHRs.



3.1 Join Ordering 

The left hand side of a rule together with the guard defines a multi-way join with 
selections (the guard) that could be processed in many possible ways, starting 
from the active constraint. This problem has been extensively addressed in the 
database literature. However, most of this work is not applicable since in the 
database context they assume the existence of information on cardinality of re- 
lations (number of stored constraints) and selectivity of various attributes. Since 
we are dealing with a programming language we have no access to such infor- 
mation, nor reasonable approximations. Another important difference is that, 
often, we are only looking for the first possible join partner, rather than all. In 
the SICStus CHR version, the calculation of partner constraints is performed 
in textual order and guards are evaluated once all partners have been identi- 
fied. In HAL we determine an optimal join order and guard scheduling using, in 
particular, mode information. 

Since we have no cardinality or selectivity information we will select a join ordering by using the number of unknown attributes in the join to estimate its cost. We assume an initial set Fixed of known variables (which arises from the active constraint), together with the set of (as yet unprocessed) partner constraints and guards. The algorithm measure, shown in Figure 1, takes as inputs the set Fixed, the sequence Partners of partner constraints in a particular order, the set FDs of functional dependencies and the set Guards of guards, and returns the triple (Measure, Goal, Lookups). Measure is an ordered pair representing the cost of the join for the particular order given by Partners. It is made up of the weighted sum $(n-1)w_1 + (n-2)w_2 + \cdots + 1\,w_{n-1}$ of the costs $w_i$ for each individual join with a partner constraint, where $n-1$ is the number of partners. The cost of an individual join is defined as a pair: the number of arguments in the new partner which are unfixed before the join; followed by the (negative of) the number of arguments which are fixed before the join. Goal gives the ordering of partner constraints and guards (with guards scheduled as early as possible). Finally, Lookups gives the queries. Queries will be made from partner constraints, where a variable name indicates a fixed value, and an underscore (_) indicates an unfixed value. For example, query p(X,_,X,Y,_) indicates a search for p/5 constraints with a given value in the first, third, and fourth argument positions, the values in the first and third position being the same.

measure(Fixed, Partners, FDs, Guards)
   Lookups := ∅; Goal := true; score := (0,0); sum := (0,0)
   while true
      repeat
         Fixed0 := Fixed
         foreach g ∈ Guards
            if invars(g) ⊆ Fixed
               Goal := Goal, g; Fixed := Fixed ∪ outvars(g); Guards := Guards \ {g}
      until Fixed = Fixed0
      if Partners = ∅ return (score, Goal, Lookups)
      let Partners = p(x̄), Partners1
      Partners := Partners1
      FDp := {p(x̄) :: fd ∈ FDs}
      Fixedp := fdclose(Fixed, FDp)
      fixedx := x̄ ∩ Fixedp
      cost := (|x̄ \ fixedx|, −|fixedx|)
      score := score + sum + cost; sum := sum + cost
      Lookups := Lookups ∪ {p((x_i ∈ fixedx ? x_i : _) | x_i ∈ x̄)}
      Fixed := Fixed ∪ x̄
      Goal := Goal, p(x̄)
   endwhile

Fig. 1. Algorithm for evaluating join ordering

Here we see the usefulness of mode information which allows us to sched- 
ule guards as early as possible. For simplicity, we treat mode information in 
the form of two functions: invars and outvars which return the set of input 
and output arguments of a procedure. We also assume that each guard has ex- 
actly one mode (it is straightforward to extend the approach to multiple modes 
and more complex instantiations). Functional dependencies are represented as $p(\bar{x}) :: S \leadsto x$ where $S \cup \{x\} \subseteq \bar{x}$, meaning that for constraint $p$ fixing all the variables in $S$ means there is at most one solution to the variable $x$. The function fdclose(Fixed, FDs) closes a set of fixed variables Fixed under the functional dependencies. fdclose(Fixed, FDs) is the least set $F \supseteq Fixed$ such that for each $p(\bar{x}) :: S \leadsto x \in FDs$ such that $S \subseteq F$ then $x \in F$.

Example 1. Consider the compilation of the rule:

p(X,Y), q(Y,Z,T,U), flag, r(X,X,U) \ s(W) ==> W = U + 1, linear(Z) | p(Z,W).

for active constraint p(X,Y) and Fixed = {X,Y}. The scores calculated for the left-to-right partner order illustrated in the rule are (3,-1), (0,0), (0,-2), (0,-1) for a total cost of (12,-9)³ together with goal

q(Y,Z,T,U), W = U + 1, linear(Z), flag, r(X,X,U), s(W)

and lookups q(Y,_,_,_), flag, r(X,X,U), s(W). An optimal order has cost (5,-7) resulting in goal

flag, r(X,X,U), W = U + 1, s(W), q(Y,Z,T,U), linear(Z)

and lookups flag, r(X,X,_), s(W), q(Y,_,_,U).

For active constraint q(Y,Z,T,U), the optimal order has cost (2,-8) resulting in goal

W = U + 1, linear(Z), s(W), flag, p(X,Y), r(X,X,U)

and lookups s(W), flag, p(_,Y), r(X,X,U).

³ Note that the cost of r(X,X,U) is (0,-2) because W = U + 1 is executed before r(X,X,U) thus grounding U. Also note that X is counted only once.

For rules with large left hand sides where examining all permutations is too 
expensive we can instead greedily search for a permutation of the partners that 
is likely to be cost effective. In practice, we have not required this as left hand 
sides of CHRs are usually small. 



3.2 Index Selection 

Once join orderings have been selected, we must determine for each constraint 
a set of lookups of constraints of that form in the store. We then select an index 
or set of indexes for that constraint that will efficiently support the lookups 
required. Finally, we choose a data structure to implement each index. Mode 
information is crucial to the selection of index data structures. If the terms 
being indexed on are not ground, then we cannot use tree indexes since variable 
bindings will change the correct position of data.^ 

The current SICStus Prolog CHR implementation uses only two index mecha- 
nisms: Constraints for a given Functor/Arity are grouped, and variables shared 
between heads in a rule index the constraint store because matching constraints 
must correspondingly share an (attributed) variable. In the HAL CHR version,
we put some extra emphasis on indexes for ground data: 

The first step in this process is lookup reduction. Given a set of lookups for
constraint p/k we reduce the number of lookups by using information about
properties of p/k:

— lookup generalization: rather than build specialized indexes for lookups that
share variables we simply use more general indexes. Thus, we replace any
lookup $p(v_1, \ldots, v_k)$ where $v_i$ and $v_j$ are the same variable by a lookup
$p(v_1, \ldots, v_{j-1}, v'_j, v_{j+1}, \ldots, v_k)$ where $v'_j$ is a new variable. Of course, we
must add an extra guard $v_i = v'_j$ for rules where we use generalized lookups.
For example, the lookup eq(X,X) can use the lookup for eq(X,Y), followed
by the guard X = Y.

^ Currently HAL only supports CHRs with fixed arguments (although these might be 
variables from another (non-Herbrand) solver). 







C. Holzbaur et al. 



— functional dependency reduction: we can use functional dependencies to reduce
the requirement for indexes. We can replace any lookup $p(v_1, \ldots, v_k)$
where there is a functional dependency $p(x_1, \ldots, x_k) :: \{x_{i_1}, \ldots, x_{i_m}\} \rightsquigarrow x_j$
and $v_{i_1}, \ldots, v_{i_m}$ are input (as opposed to anonymous) variables to the query
by the lookup $p(v_1, \ldots, v_{j-1}, \_, v_{j+1}, \ldots, v_k)$. For example, consider the
constraint bounds(X,L,U) which stores the lower L and upper U bounds for a
constrained integer variable X. Given functional dependency $bounds(X,L,U) :: X \rightsquigarrow L$,
the lookup bounds(X,L,_) can be replaced by bounds(X,_,_).

— symmetry reduction: if p/k is symmetric on arguments i and j and we have two
symmetric lookups $p(v_1, \ldots, v_i, \ldots, v_j, \ldots, v_k)$ and $p(v'_1, \ldots, v'_i, \ldots, v'_j, \ldots, v'_k)$
where $v_l = v'_l$ for $1 \leq l \leq k$, $l \neq i$, $l \neq j$, and $v_i = v'_j$ and $v_j = v'_i$, then
remove one of the symmetric lookups. For example, if eq/2 is symmetric the
lookup eq(_,Y) can use the index for eq(X,_).
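The three reductions can be sketched as small Python functions. This is an illustrative sketch only: a lookup is assumed to be a tuple of variable names with "_" for anonymous arguments, a functional dependency is a pair of argument positions (determinants, target), and the fresh-variable naming scheme is hypothetical.

```python
def generalize(lookup):
    """Lookup generalization: rename repeated variables apart and
    return the general lookup plus the equality guards to re-check."""
    seen, general, guards = set(), [], []
    for position, var in enumerate(lookup):
        if var != "_" and var in seen:
            fresh = f"{var}_{position}"
            general.append(fresh)
            guards.append((var, fresh))
        else:
            seen.add(var)
            general.append(var)
    return tuple(general), guards


def fd_reduce(lookup, fds):
    """Functional dependency reduction: make the target argument
    anonymous when all of its determinants are still inputs."""
    reduced = list(lookup)
    for determinants, target in fds:
        if all(reduced[d] != "_" for d in determinants):
            reduced[target] = "_"
    return tuple(reduced)


def shape(lookup):
    """Which argument positions are inputs (True) or anonymous (False)."""
    return tuple(arg != "_" for arg in lookup)


def symmetry_reduce(shapes, i, j):
    """Symmetry reduction: drop a lookup shape whose mirror image on
    arguments i and j (or the shape itself) is already kept."""
    kept = []
    for s in shapes:
        mirrored = list(s)
        mirrored[i], mirrored[j] = mirrored[j], mirrored[i]
        if s not in kept and tuple(mirrored) not in kept:
            kept.append(s)
    return kept


print(generalize(("X", "X")))
print(fd_reduce(("X", "L", "_"), [((0,), 1), ((0,), 2)]))
print(symmetry_reduce([shape(("X", "_")), shape(("_", "Y"))], 0, 1))
```

Checking determinants against the already reduced lookup is a conservative choice: it prevents two mutually dependent arguments from both being made anonymous.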

We discuss how we generate functional dependency and symmetry informa- 
tion in Section 5. We can now choose the data structures for the indexes that 
support the remaining lookups. The default choice is a balanced binary search 
tree (BST). Note that using a BST we can sometimes merge two indexes, for 
example, a BST for eq(X,Y) can also efficiently answer eq(X,_) queries. 

Normally, the index will return an iterator which iterates through the multi- 
set of constraints that match the lookup. Conceptually, each index thus returns 
a list iterator of constraints matching the lookup.® We can use functional
dependencies to determine when this multiset can have at most one element. This
is the case for a lookup $p(v_1, \ldots, v_k)$ with fixed variables $v_{i_1}, \ldots, v_{i_m}$ such that
$fdclose(\{x_{i_1}, \ldots, x_{i_m}\}, FD_p) \supseteq \{x_1, \ldots, x_k\}$ where $FD_p$ are the functional
dependencies for p/k. For example, the lookup bounds(X,_,_) returns at most one
constraint given the functional dependencies $bounds(X,L,U) :: X \rightsquigarrow L$ and
$bounds(X,L,U) :: X \rightsquigarrow U$. Iterators with at most one entry can return a yesno
iterator rather than a list.

Since, in general, we may need to store multiple copies of identical constraints 
(CHR rules accept multisets rather than sets of constraints) each constraint 
needs to be stored with a unique identifier, called the constraint number. Code 
for the constraint will generate a new identifier for each new active constraint. 

Each index for $p(v_1, \ldots, v_k)$, where say the fixed variables are $v_{i_1}, \ldots, v_{i_m}$,
needs to support operations for initializing a new index, inserting and delet- 
ing constraints from the index and returning an iterator over the index for 
a given lookup. Note that the constraint number is an important extra argu- 
ment for index manipulation. The compiler generates code for the predicates 
p_insert_constraint and p_delete_constraint which insert and delete the 
constraint p from each of the indexes in which it is involved. 
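The index operations described above can be sketched as follows in Python. This sketch swaps in a hash map keyed by the fixed argument values in place of the balanced search tree mentioned earlier, and the class and method names are hypothetical; only the roles of the insert and delete operations and of the constraint number follow the text.

```python
from collections import defaultdict
from itertools import count


class Index:
    """Index over a fixed tuple of argument positions (hash map, not a BST)."""

    def __init__(self, positions):
        self.positions = tuple(positions)
        self.buckets = defaultdict(dict)  # key -> {constraint number: args}

    def _key(self, args):
        return tuple(args[p] for p in self.positions)

    def insert(self, args, number):
        self.buckets[self._key(args)][number] = args

    def delete(self, args, number):
        self.buckets[self._key(args)].pop(number, None)

    def iterate(self, key):
        # Iterate over a snapshot, so the store may change during iteration.
        return iter(list(self.buckets.get(tuple(key), {}).items()))


class ConstraintStore:
    """All indexes for one constraint p, with p_insert/p_delete analogues."""

    def __init__(self, index_positions):
        self.indexes = [Index(p) for p in index_positions]
        self._numbers = count(1)

    def new_constraint_number(self):
        return next(self._numbers)

    def insert_constraint(self, args, number):
        for index in self.indexes:
            index.insert(args, number)

    def delete_constraint(self, args, number):
        for index in self.indexes:
            index.delete(args, number)
```

Because entries are keyed by constraint number, two identical copies of a constraint coexist in the same bucket, and deleting one copy leaves the other in place.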

Example 2. Suppose a CHR constraint eq/2 has lookups eq(X,_) and eq(_,Y), 
and eq/2 is known to be symmetric in its two arguments. We can remove the 

® Some complexities arise with the insertion and deletion of constraints during the 
execution of the iterator, because once a rule commits, the changes to the store 
regarding the removal of constraints and the addition of the active constraint have 
to take effect to implement an “immediate update view” . 
gcd_3(M,CN1) :-
    (gcd_index_exists_iteration(N,CN2),
     M >= N, CN1 != CN2 ->                  %% guard
        gcd_delete_constraint(M,CN1),
        gcd(M-N),                           %% RHS
        gcd_3_succ_cont(M,CN1)
    ;   gcd_3_fail_cont(M,CN1)).

gcd_2(N,CN1) :-
    gcd_index_iteration_init(I0),
    gcd_2_forall_iterate(N,CN1,I0).

gcd_2_forall_iterate(N,CN1,I0) :-
    gcd_iteration_last(I0),
    gcd_2_succ_cont(N,CN1).
gcd_2_forall_iterate(N,CN1,I0) :-
    gcd_iteration_next(I0, M, CN2, I1),
    (M >= N, CN1 != CN2 ->                  %% guard
        gcd_delete_constraint(M,CN2),
        gcd(M-N)                            %% RHS
    ;   true),                              %% rule did not apply
    gcd_2_forall_iterate(N,CN1,I1).



Fig. 2. Code for existential partner search and universal partner search. 



lookup eq(_, Y) in favor of the symmetric eq(X, _) , and then use a single balanced 
tree index for eq(X,Y) to store eq/2 constraints since this can also efficiently 
retrieve constraints of the form eq(X,_). 

3.3 Code Generation for Individual Occurrences of Active Constraints

Once we have determined the join order for each rule and each active constraint, 
and the indexes available for each constraint, we are ready to generate code 
for each occurrence of the active constraint. Two kinds of searches for partners 
arise: A universal search iterates over all possible partners. This is required for 
propagation rules where the rule fires for each possible matching partner. An 
existential search looks for only the first possible set of matching partners. This 
is sufficient for simplification rules where the constraints found will be deleted. 

We can split the constraints appearing on the left-hand-side of any kind of 
rule into two sets: those that are deleted by the rule (Remove), and those that
are not (Keep). The partner search uses universal search behavior, up to and
including the first constraint in the join which appears in Remove. From then 
on the search is existential. If the constraint has a functional dependency that 
ensures that there can be only one matching solution, we can replace universal 
search by existential search. 
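The choice between the two kinds of search can be sketched as below. This Python sketch assumes that the join is given as a list of occurrence names with the active constraint first, that Remove is a set of those names, and that the set of partners known to have at most one match is supplied separately; all names are hypothetical.

```python
def search_kinds(join, remove, single_match=frozenset()):
    """Map each partner in `join` (active constraint first) to the kind
    of search used to find it."""
    kinds = {}
    seen_removed = False
    for position, occurrence in enumerate(join):
        # Universal up to and including the first constraint in Remove.
        universal = not seen_removed
        if occurrence in remove:
            seen_removed = True
        if position == 0:
            continue  # the active constraint itself is not searched for
        if universal and occurrence in single_match:
            universal = False  # at most one match: existential suffices
        kinds[occurrence] = "universal" if universal else "existential"
    return kinds


# Rule gcd(N) \ gcd(M) <=> M >= N | gcd(M-N): only gcd(M) is removed.
print(search_kinds(["gcd_M", "gcd_N"], {"gcd_M"}))  # active gcd(M)
print(search_kinds(["gcd_N", "gcd_M"], {"gcd_M"}))  # active gcd(N)
```

With the removed gcd(M) as the active constraint, the partner is found by existential search; with the kept gcd(N) active, the partner is found by universal search.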

For each partner constraint we need to choose an available index for finding 
the matching partners. Since we have no selectivity or cardinality information, 
we simply choose the index with the largest intersection with the lookup. 
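As an illustration of this heuristic, the following Python sketch picks, among the available indexes, the one sharing the most argument positions with the lookup. An index is assumed to be described by the tuple of positions it is built on; whether an index with extra positions can actually serve the lookup depends on the data structure (a search tree can answer prefix queries, a hash table cannot), which this sketch does not model.

```python
def choose_index(lookup_fixed, indexes):
    """Return the index (a tuple of argument positions) with the largest
    intersection with the fixed positions of the lookup.
    Ties are broken in favour of the earliest index in the list."""
    fixed = set(lookup_fixed)
    return max(indexes, key=lambda positions: len(set(positions) & fixed))


print(choose_index({0, 2}, [(0,), (0, 1), (0, 2)]))
```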

Example 3. Consider the compilation of the 3rd occurrence of a gcd/1 constraint 
in the program in the introduction (the second occurrence in (T6)) which is 
to be removed. Since the active constraint is in Remove the entire search is 
existential. The compilation produces the code gcd_3 shown in Figure 2. The 
predicate gcd_index_exists_iteration iterates non-deterministically through 
the gcd/1 constraints in the store using the index (on no arguments). It returns 
the value of the gcd/1 argument as well as its constraint number. Next, the 
guard is checked. Additionally, we check that the two gcd/1 constraints are in 
fact different by comparing their constraint numbers (CN1 != CN2). If a partner
is found, the active constraint is removed from the store, and the body is called. 
Afterwards, the success continuation for this occurrence is called. If no partner 
is found the failure continuation is called. 

The compilation for the second occurrence of a gcd/1 constraint (the first
occurrence in (T6)) requires universal search for partners. The compilation produces
the code gcd_2 shown in Figure 2. The predicate gcd_index_iteration_init
returns an iterator of gcd/1 constraints resulting from looking up the index.
The call to gcd_iteration_last succeeds if the iterator is finished, while
gcd_iteration_next returns the value of the next gcd/1 constraint (and its
constraint number) as well as the new iterator.

3.4 Joining the Code Generated for Each Constraint Occurrence 

After generating the code for each individual occurrence, we must join it all 
together in one piece of code. The code is ordered according to the textual 
order of the associated constraint occurrences except for simpagation rules where 
occurrences after the \ symbol are ordered earlier than those before the symbol 
(since they will then be deleted, thus reducing the number of constraints in the 
store). Let the order of occurrences be $o_1, \ldots, o_m$. The simplest way to join the
individual rule code for a constraint p/k is as follows: Code for p/k creates a
new constraint number and calls the code for the first occurrence, $p\_o_1/k+1$. The
fail continuation for $p\_o_j/k+1$ is set to $p\_o_{j+1}/k+1$. The success continuation
for $p\_o_j/k+1$ is also set to $p\_o_{j+1}/k+1$ unless the active constraint for this
occurrence is in Remove in which case the success continuation is true, since 
the active constraint has been deleted. 

Example 4. For the gcd program the order of the occurrences is 1, 3, 2. The fail
continuations simply reflect the order in which the occurrences are processed:
gcd_1 continues to gcd_3 which continues to gcd_2 which continues to true.
Clearly, the success continuations for occurrences 1 and 3 of gcd/1 are true
since the active constraint is deleted. The success continuation of gcd_2 is true
since it is last. The remaining code for gcd/1 is given in Figure 3.®
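The joining scheme of this section can be sketched as a small table computation in Python. The function name and the use of None to stand for the continuation true are assumptions of this sketch.

```python
def continuations(order, active_removed):
    """Fail and success continuations for each occurrence.

    order: occurrence numbers in processing order.
    active_removed: occurrences whose active constraint is in Remove.
    A continuation of None stands for `true`.
    """
    table = {}
    for j, occurrence in enumerate(order):
        following = order[j + 1] if j + 1 < len(order) else None
        table[occurrence] = {
            "fail": following,
            # After a removing occurrence fires, the active constraint is gone.
            "success": None if occurrence in active_removed else following,
        }
    return table


# gcd: occurrences are processed in order 1, 3, 2; occurrences 1 and 3 remove
# the active constraint.
for occurrence, conts in continuations([1, 3, 2], {1, 3}).items():
    print(occurrence, conts)
```

For the gcd program this gives fail continuations 1 to 3, 3 to 2 and 2 to true, with every success continuation equal to true, as stated in Example 4.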



4 Improving CHR Compilation 

In the previous section we examined the basic steps for compiling CHRs taking
advantage of type, mode, functional dependency and symmetry information. In
this section we explore other kinds of optimizations based on analysis of CHRs.

4.1 Continuation Optimization 

We can improve the simple strategy for joining the code generated for each 
occurrence of a constraint by noticing correspondences between rule matchings 

Note that later compiler passes remove the overhead of chain rules and empty rules. 
gcd(N) :-
    new_constraint_number(CN1),
    gcd_insert_constraint(N,CN1),
    gcd_1(N,CN1).
gcd_1(N,CN1) :-
    (N = 0 ->                               %% Guard
        gcd_delete_constraint(N,CN1),
        true,                               %% RHS
        gcd_1_succ_cont(N,CN1)
    ;   gcd_1_fail_cont(N,CN1)).

gcd_1_succ_cont(_,_).
gcd_1_fail_cont(N,CN1) :- gcd_3(N,CN1).
gcd_3_succ_cont(_,_).
gcd_3_fail_cont(N,CN1) :- gcd_2(N,CN1).
gcd_2_succ_cont(_,_).
gcd_2_fail_cont(_,_).



Fig. 3. Initial code, code for first occurrence and continuation code for gcd. 



for various occurrences. Suppose we have two consecutive occurrences with active 
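constraints, partner constraints and guards given by the triples $(p(\bar{x}), c, g)$ and
$(p(\bar{y}), c', g')$ respectively. Suppose we can prove that $\models (\bar{x} = \bar{y} \wedge (\bar{\exists}_{\bar{y}}\, c' \wedge g')) \rightarrow \bar{\exists}_{\bar{x}}\, c \wedge g$
(where $\bar{\exists}_V F$ indicates the existential quantification of F for all its variables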
not in set V). Then, anytime the first occurrence fails to match the second 
occurrence will also fail to match, since the store has not changed meanwhile. 
Hence, the fail continuation for the first occurrence can skip over the second 
occurrence. We can use whatever reasoning we please to prove the implication. 
Currently, both the SICStus and HAL versions of the CHR compiler use very
simple implication reasoning about identical constraints and true.

Example 5. Consider the following rules which manipulate bounds (X,L,U) con- 
straints. 

ne  @ bounds(X,L,U) ==> U >= L.
red @ bounds(X,L1,U1) \ bounds(X,L2,U2) <=> L1 >= L2, U1 <= U2 | true.
int @ bounds(X,L1,U1), bounds(X,L2,U2) <=> bounds(X,max(L1,L2),min(U1,U2)).

For the 4th and 5th occurrences, both in rule int, the implication

$(X_4 = X_5 \wedge \exists L2_4, U2_4\; bounds(X_4, L2_4, U2_4)) \rightarrow \exists L1_5, U1_5\; bounds(X_5, L1_5, U1_5)$

(where we use subscripts to indicate which is the active occurrence) holds. Hence, 
the 5th occurrence will never succeed if the 4th fails. Since if the 4th succeeds 
the active constraint is deleted, the 5th occurrence can be omitted entirely. 



4.2 Late Storage 

The first action in processing a new active constraint is to add it to the store, 
so that when it fires, the store has already been updated. In practice, this is 
inefficient since it may quite often be immediately removed. We can delay the 
addition of the active constraint until just before executing a right-hand-side 
that does not delete the active constraint, and can affect the store (i.e., may 
make use of the CHR constraints in the store). 
Example 6. Consider the compilation of gcd/1. The first and third occurrences
delete the active constraint. Thus, the new gcd/1 constraint need not be stored
before they are executed. It is only required to be stored just before the code 
for the second occurrence. The call to gcd_insert_constraint can be moved to 
the beginning of gcd_2, while the calls to gcd_delete_constraint in gcd_l and 
gcd_3 can be removed. 

This information can be inferred with a simple pre-analysis. For simplicity, 
we can consider a rule as rhs-affects-store if its right-hand-side calls a CHR
constraint, or a local predicate which calls constraints (directly or indirectly), or 
(to be safe) an external predicate which is not a library predicate. 
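The pre-analysis can be sketched as a reachability check over the call graph. In this Python sketch the call graph is assumed to be given as a dictionary from local predicate names to the names they call, and the sets of CHR constraints and library predicates are supplied by the caller; these inputs and names are hypothetical.

```python
def rhs_affects_store(body_calls, chr_constraints, definitions, library):
    """True if a right-hand side calling `body_calls` may affect the store.

    definitions: local predicate name -> names called in its body.
    Anything that is neither local, nor a CHR constraint, nor a library
    predicate is treated as an unknown external predicate (unsafe).
    """
    seen = set()
    stack = list(body_calls)
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        if name in chr_constraints:
            return True
        if name in definitions:
            stack.extend(definitions[name])
        elif name not in library:
            return True
    return False


# The second gcd rule calls gcd/1 on its right-hand side; the first calls true.
print(rhs_affects_store(["gcd"], {"gcd"}, {}, {"true"}))
print(rhs_affects_store(["true"], {"gcd"}, {}, {"true"}))
```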

4.3 Set Semantics 

Although CHRs use a multiset semantics, often the constraints defined by CHRs 
have a set semantics, where the number of copies of a constraint does not matter. 
In the HAL version, indexes for constraints with set semantics can take advan- 
tage of this information (by not worrying about duplicates). We can recognize 
constraints with set semantics in two different ways. 

A constraint p/k has set semantics if there is a rule which explicitly removes 
duplicates of constraints. That is, if there exists a rule of the form 

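$p(x_1, \ldots, x_k) \setminus p(y_1, \ldots, y_k) \Longleftrightarrow g \mid true$

such that $\models x_1 = y_1 \wedge \cdots \wedge x_k = y_k \rightarrow \bar{\exists}_{\bar{x} \cup \bar{y}}\, g$ which occurs before any rule
requires p/k to be stored or which can match two identical copies of p/k. For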
instance, the rule red from Example 5 ensures that any new active bounds/3 
constraint identical to one already in the store will be deleted (it also deletes 
other redundant bounds information). 

A constraint also has set semantics if all rules in which it appears behave 
the same even if duplicates are present. This is a very common case since CHRs 
are used to build constraint solvers which (by definition) should treat constraint 
multisets as sets. Thus, a constraint p/k also has set semantics if 

— there are no rules which can match two identical copies of p/k,

— there are no rules that delete a constraint p/k without deleting all identical
copies,

— there are no rules with occurrences of p/k that can generate constraints (on
the rhs) which do not have set semantics.

A simple fixpoint analysis can detect such constraints starting from the assump- 
tion that all constraints have set semantics. 

For constraints p/k having this form we can safely add a rule of the form 

$p(x_1, \ldots, x_k) \setminus p(x_1, \ldots, x_k) \Longleftrightarrow true.$

This will avoid redundant work when duplicate constraints are added. 

Example 7. Consider a constraint eq/2 (for equality) defined by the CHR

eq(X,Y), bounds(X,LX,UX), bounds(Y,LY,UY) ==> bounds(Y,LX,UX), bounds(X,LY,UY).

Then, since bounds/3 has set semantics, eq/2 also has set semantics. 
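The fixpoint analysis for the third condition can be sketched as follows. This Python sketch assumes that the constraints passing the other checks (or known to have set semantics through a duplicate-removing rule) are given as a set of candidates, and that for each constraint we know which CHR constraints may be generated by the rules it occurs in; both inputs and all names are hypothetical.

```python
def set_semantics(candidates, generates):
    """Greatest fixpoint: start by assuming every candidate has set
    semantics, then discard any constraint whose rules may generate a
    constraint that is not (or no longer) in the set."""
    result = set(candidates)
    changed = True
    while changed:
        changed = False
        for constraint in list(result):
            if not generates.get(constraint, set()) <= result:
                result.discard(constraint)
                changed = True
    return result


# eq/2 occurs in a rule generating bounds/3; bounds/3 rules generate bounds/3.
generates = {"eq": {"bounds"}, "bounds": {"bounds"}}
print(sorted(set_semantics({"eq", "bounds"}, generates)))  # ['bounds', 'eq']
```

If bounds/3 were not a candidate, eq/2 would be discarded in the first pass, since its rule generates a constraint without set semantics.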
5 Determining Functional Dependencies and Symmetries

In previous sections we have either explained how to determine the information 
used for an optimization (as in the case of rules which are rhs-affects-store) or 
assumed it was given by the user or inferred by the compiler in the usual way 
(as in type, mode and determinism). The only two exceptions (functional depen- 
dencies and symmetries) were delayed in order not to clutter the explanation of 
CHR compilation. The following two sections examine how to determine these 
two properties. 



5.1 Functional Dependencies 

Functional dependencies occur frequently in CHRs since they encode functions 
using relations. Suppose p/k need not be stored before occurrences in a rule of
the form

$p(x_1, \ldots, x_l, y_{l+1}, \ldots, y_k)\ [,\setminus]\ p(x_1, \ldots, x_l, z_{l+1}, \ldots, z_k) \Longleftrightarrow d_1, \ldots, d_m$

where $x_i$, $1 \leq i \leq l$ and $y_i, z_i$, $l+1 \leq i \leq k$ are distinct variables. Then,
this rule ensures that there is at most one constraint in the store of the form
$p(x_1, \ldots, x_l, \_, \ldots, \_)$ at any time. This corresponds to the functional
dependencies $p(x_1, \ldots, x_k) :: \{x_1, \ldots, x_l\} \rightsquigarrow x_i$, $l+1 \leq i \leq k$. For example, the rule int
of Example 5 illustrates the functional dependencies $bounds(X,L,U) :: X \rightsquigarrow L$
and $bounds(X,L,U) :: X \rightsquigarrow U$.

We can detect more functional dependencies if we consider multiple rules of 
the same kind. For example, the rules 

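$p(x_1, \ldots, x_l, y_{l+1}, \ldots, y_k)\ [,\setminus]\ p(x_1, \ldots, x_l, z_{l+1}, \ldots, z_k) \Longleftrightarrow g_1 \mid d_1, \ldots, d_m$

$p(x_1, \ldots, x_l, y'_{l+1}, \ldots, y'_k)\ [,\setminus]\ p(x_1, \ldots, x_l, z'_{l+1}, \ldots, z'_k) \Longleftrightarrow g_2 \mid d'_1, \ldots, d'_{m'}$

also lead to functional dependencies if $\models (\bar{y} = \bar{y}' \wedge \bar{z} = \bar{z}') \rightarrow (g_1 \vee g_2)$ is provable.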

Example 8. The second rule for gcd/1 written twice illustrates the functional
dependency $gcd(N) :: \emptyset \rightsquigarrow N$ since $N = M' \wedge M = N' \rightarrow (M \geq N \vee M' \geq N')$
holds:

gcd(N) \ gcd(M) <=> M >= N | gcd(M - N).
gcd(N') \ gcd(M') <=> M' >= N' | gcd(M' - N').



Making use of this functional dependency for gcd/1 we can use a single global 
yesno integer value ($Gcd) to store the (at most one) gcd/1 constraint, we can 
replace the forall iteration by exists iteration, and remove the constraint numbers 
entirely. The resulting code (after unfolding) is 



gcd(X) :-
    (X = 0 -> true                          %% occ 1: guard -> rhs
    ; (yes(N) = $Gcd, X >= N ->             %% occ 3: gcd_index_exists_iteration, guard
          gcd(X-N)                          %% occ 3: rhs
      ; (yes(M) = $Gcd, M >= X ->           %% occ 2: gcd_forall_iterate, guard
            $Gcd := yes(X),                 %% occ 2: gcd_insert_constraint
            gcd(M-X)                        %% occ 2: rhs
        ; $Gcd := yes(X)))).                %% late insert
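To make the control flow of the unfolded code concrete, here is a transliteration into Python. It is a sketch only: the class GcdStore and its field are hypothetical stand-ins for the global yesno value $Gcd, with None playing the role of no and an integer n playing the role of yes(n).

```python
class GcdStore:
    """Holds the (at most one) stored gcd/1 constraint."""

    def __init__(self):
        self.gcd = None  # stand-in for $Gcd

    def add(self, x):
        """Add the constraint gcd(x) and run the rules to completion."""
        if x == 0:                                  # occ 1: guard -> rhs
            return
        stored = self.gcd
        if stored is not None and x >= stored:      # occ 3: guard
            self.add(x - stored)                    # occ 3: rhs
        elif stored is not None and stored >= x:    # occ 2: guard
            self.gcd = x                            # occ 2: insert
            self.add(stored - x)                    # occ 2: rhs
        else:
            self.gcd = x                            # late insert


store = GcdStore()
store.add(9)
store.add(6)
print(store.gcd)  # 3
```

The recursion performs one subtraction per step, so very unbalanced arguments lead to deep recursion in this sketch.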
5.2 Symmetry

Symmetry also occurs reasonably often in CHRs. There are multiple ways of 
detecting symmetries. A rule of the form 

$p(x_1, x_2, \ldots, x_k) \Longrightarrow p(x_2, x_1, \ldots, x_k)$

that occurs before any rule that requires p/k to be inserted induces a symmetry
for constraint $p(x_1, \ldots, x_k)$ on $x_1$ and $x_2$, providing that no rule eliminates
$p(x_1, x_2, \ldots, x_k)$ and not $p(x_2, x_1, \ldots, x_k)$.

Example 9. Consider a !=/2 constraint defined by the rules: 

neqset   @ X != Y \ X != Y <=> true.
neqsym   @ X != Y ==> Y != X.
neqlower @ X != Y, bounds(X,VX,VX), bounds(Y,VX,UY) ==> bounds(Y,VX+1,UY).
nequpper @ X != Y, bounds(X,VX,VX), bounds(Y,LY,VX) ==> bounds(Y,LY,VX-1).

the rule neqsym @ X != Y ==> Y != X illustrates the symmetry of !=/2 w.r.t. X
and Y, since in addition no rule deletes a (non-duplicate) !=/2 constraint.

A constraint may be symmetric without a specific symmetry adding rule. 
The general case is complicated and, for brevity, we simply give an example. 

Example 10. The rule in Example 7 and its rewriting with $\{X \mapsto Y, Y \mapsto X\}$
are logically equivalent (they are variants illustrated by the reordering of the
rule).

eq(X,Y), bounds(X,LX,UX), bounds(Y,LY,UY) ==> bounds(Y,LX,UX), bounds(X,LY,UY).
eq(Y,X), bounds(Y,LY,UY), bounds(X,LX,UX) ==> bounds(X,LY,UY), bounds(Y,LX,UX).

Hence, since this is the only rule for eq/2, the eq/2 constraint is symmetric. 

6 Experimental Results 

Our initial version of the HAL CHR compiler performs only some of the optimiza- 
tions outlined above, including join ordering and continuation optimization and 
late storage. We do not yet have the automated analysis to support discovery of 
functional dependencies, set semantics and symmetries, nor specialized indexes 
(which rely on this information). It is ongoing work to improve the compiler, to 
perform the appropriate analyses, and make use of the information during the 
compilation. 

To get an estimate on the benefits achievable through the optimizing compi- 
lation, we have modified the code produced by the current compiler by hand, in 
a way as close as possible to how we foresee its implementation. The comparisons 
against the SICStus CHR versions primarily serve as a simple “reality check” 
for the HAL version in its infancy. Any deductions beyond that would have to 
consider all the differences between SICStus and HAL producing Mercury code. 
Table 1. Execution times (ms) for various optimized versions of CHR programs.

[Table body not recoverable; surviving cell values in source order: 57435, =, 98, 88, =, 180220, =, 10576, 1209, 1061, -, -]




We compare the performance on 3 small programs: 

— gcd as described in the paper, where the query (a,b) is gcd(a), gcd(b).

— interval: a simple bounds propagation solver on N-queens; where the query
(a, b) is for a queens with each constraint added b times (usually 1, just here
to illustrate the possible benefits from set semantics).

— dfa: a visual parser for DFAs building the DFA from individual graphics
elements, e.g. circles, lines and text boxes. The constraints are all ground, and
the compilation involves a single (indexable) lookup line(_,Y), has a single
symmetry line(X,Y) = line(Y,X) and no constraints have set semantics.
In this program the rules are large multi-way joins, e.g., the rule to detect
an arrow from one state to another is:

circle(C1,R1), circle(C2,R2) \
    line(P1,P2), line(P3,P2), line(P4,P2), text(P5,T) <=>
    point_on_circle(P1,C1,R1), point_on_circle(P2,C2,R2),
    midpoint(P1,P2,P12), near(P12,P5) | arrow(P1,P2,T).

The query a finds a (constant) small DFA (of 10 elements) in a large set of 
a graphical elements (to illustrate indexing) . 

The results are shown in Table 1. All timings are the average over 20 runs 
on a dual Pentium II-400MHz with 384M of RAM running under Linux RedHat 
5.2 with kernel version 2.2, and are given in milliseconds. SICStus Prolog 3.8.4 
is run under compact code (there is no fastcode for Linux). 

For gcd we first give times for the original output of the compiler Orig. In the 
version +yesno the list storage of constraints is replaced by a yesno structure
(using the functional dependency). We can see a significant improvement here 
by just avoiding some loop overhead. Next in +det the determinism of produced 
code is modified to take into account the functional dependency. Here we can
really see the benefits of taking advantage of the functional dependency. Finally 
in +nn constraint numbers are completely removed (and this massively simplifies 
the resulting code). We also give hand which uses the code in Example 8 (as a 
lower bound on where optimization can reach), and SICS the original code in 
SICStus Prolog 3.8.4. 

In the second experiment we give: the original code Orig, +tree where all list 
indexes have been replaced by 2-3-4 trees, +sym where symmetric constraints have
the symmetry handled by indexes, and +eq where set semantics optimizations 
are applied (note that an = means the code is identical, i.e. there was no scope for 
the optimization). Finally, we compare with the SICStus Prolog CHR compiler 
SICS, and for the interval example, a nonground version of the program 
SICS-V which uses attributed variable indexing (the other benchmarks involve 
only ground constraints). 

The advantage of join ordering is illustrated by the difference between HAL
and SICStus on dfa; the principal difference here is simply the better join
ordering and early guard scheduling of HAL.

Adding indexes is clearly important when there are a significant number 
of constraints and effective lookups. In dfa, since there is only one indexed 
lookup, if the constraint stores are too small the overhead of the trees nullifies 
the advantages of the lookup. As the number of elements grows the advantages 
become clear. 

Handling symmetry turned out to be disappointing. While it can reduce the 
number of indexes, it doubles their size and hence the possible benefit is limited. 
The overhead of managing symmetry in the index overwhelms the advantages
when the constraint store is small; the advantages only become visible when the
constraint store grows very large (dfa 3000). The handling of set semantics is of
considerable benefit when duplicate constraints are actually added, and doesn’t 
add significant overhead when there are no duplicate constraints, hence it seems 
worthwhile. 

7 Conclusion and Future Work 

The core of compiling CHRs is a multi-way join compilation. But, unlike the 
usual database case, we have no information on the cardinality of relations and 
index selectivity. We show how to use type and mode information to compile 
efficient joins, and automatically utilize appropriate indexes for supporting the 
joins. We show how functional dependencies and symmetries can improve this 
compilation process. We further investigate how, by analyzing the CHRs
themselves, we can find other opportunities for improving compilation, as well as
determine functional dependencies, symmetries and other algebraic features of
the CHR constraints.
Abstract. Experience using constraint programming to solve real-life
problems has shown that finding an efficient solution to a problem often 
requires experimentation with different constraint solvers or even build- 
ing a problem-specific solver. HAL is a new constraint logic programming 
language expressly designed to facilitate this process. In this paper we 
examine different ways of building solvers in HAL. We explain how type 
classes can be used to specify solver interfaces, allowing the constraint 
programmer to support modelling of a constraint problem independently 
of a particular solver, leading to easy “plug and play” experimentation. 
We compare a number of different ways of writing a simple solver in HAL: 
using dynamic scheduling, constraint handling rules and building on an 
existing solver. We also examine how external solvers may be interfaced 
with HAL, and approaches for removing interface overhead. 



1 Introduction 

There is no single best technique for solving combinatorial optimization and 
constraint satisfaction problems. Thus, constraint programmers would like to be 
able to easily experiment with different constraint solvers and to readily develop 
new problem-specific constraint solvers. The new constraint logic programming 
(CLP) language HAL [3] has been specifically designed to allow the user to easily 
experiment with different constraint solvers over the same domain, to support 
extension of solvers and construction of hybrid solvers, and to call procedures 
(in particular, solvers) written in other languages with little overhead. 

In order to do so, HAL provides semi-optional type, mode and determinism 
declarations for predicates and functions. These allow the generation of efficient 
target code, ensure that solvers and other procedures are being used in the cor- 
rect way, and facilitate efficient integration with foreign language procedures. 
Type information also means that predicate and function overloading can be 
resolved at compile-time, allowing a natural syntax for constraints. To facilitate 
writing simple constraint solvers, extending existing solvers and combining them, 
HAL provides dynamic scheduling by way of a specialized delay construct which 
supports definition of “propagators.” Finally, HAL provides “global variables” 
which allow efficient implementation of a persistent constraint store. They
behave in a similar manner to C’s static variables and are only visible within the
module in which they are defined. 

The initial design of HAL was described in [3]. The current paper extends
this in five main ways. First, we describe how the addition of type classes pro- 
vides a natural way of specifying a constraint solver’s capabilities and, therefore, 
support for “plug and play” with solvers. Second, we give a more detailed de- 
scription of how HAL supports solver dependent dynamic scheduling and its use 
in writing solvers. Third, we describe how to integrate foreign language solvers 
into HAL and some programming techniques for reducing the runtime overhead 
of the solver interface. Fourth, we discuss the integration of constraint handling 
rules (CHRs) into HAL. Finally, we compare the efficiency of solvers written in 
HAL using CHRs, dynamic scheduling and type classes with comparable solvers 
written in SICStus and compare the overhead of HAL’s external solver interface 
for CPLEX with that of ECLiPSe. 

Thus, the main focus of the current paper is how to provide generic mecha- 
nisms such as type classes, dynamic scheduling and CHRs for structuring, writ- 
ing and extending constraint solvers in a constraint programming language with 
type, mode and determinism declarations and what, if any, performance advan- 
tage is provided by this additional information. 



2 The HAL Language 

In this section we provide a brief overview of HAL [3], a CLP language which 
is compiled to the logic programming language Mercury [15].^ The basic HAL 
syntax follows the standard CLP syntax, with variables, rules and predicates 
defined as usual (see, e.g., [14] for an introduction to CLP). The module system 
in HAL is similar to that of Mercury. A module is defined in a file, it imports the 
modules it uses and has export annotations on the declarations of the objects 
that it wishes to be visible to those importing the module. Selective importation 
is also possible. The core language supports the basic integer, float, string, and 
character data types plus polymorphic constructor types (such as lists) based 
on these basic types. This support is, however, limited to assignment, testing 
for equality, and construction and deconstruction of ground terms. More sophis- 
ticated constraint solving is provided by importing a constraint solver for each 
type involved. 

As a simple example, the following program is a HAL version of the now 
classic CLP program mortgage. 



:- module mortgage.                                            (L1)
:- import simplex.                                             (L2)
:- export pred mortgage(cfloat,cfloat,cfloat,cfloat,cfloat).   (L3)
:-        mode mortgage(oo,oo,oo,oo,oo) is nondet.             (L4)



^ The key difference between them is that Mercury does not support constraints and 
constraint solvers. In fact, Mercury only provides a limited form of unification. 
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mortgage (P,0 . 0, 1 ,R,P) . (^1) 

mortgage (P,T, I ,R,B) T >= 1.0, mortgage(P+P*I-R,T-l . 0, I ,R,B) . (^2) 

Line (L1) states that this file defines the module mortgage. Line (L2) imports 
a module called simplex. This module provides a simplex-based linear 
arithmetic constraint solver for constrained floats, called cfloats. Line (L3) 
exports the predicate mortgage, which takes five cfloats as arguments. This is 
the type declaration for mortgage. A type specifies the representation format 
of a variable. Thus, for example, the type system distinguishes between 
constrained floats (cfloat) and standard numerical floats (float) since they 
have a different representation. Types are defined using (polymorphic) regular 
tree type statements. For instance, the list/1 constructor type is defined by 

    :- typedef list(T) -> [] ; [T|list(T)].

Line (L4) provides a mode declaration for mortgage. Mode declarations 
associate a mode with each argument of a predicate. A mode is a mapping 
Inst_1 -> Inst_2 where Inst_1 and Inst_2 describe the instantiation of an 
argument on call and on success from the predicate, respectively. The base 
instantiations are new, old and ground. Variable X is new if it has not been 
seen by the constraint solver, old if it has, and ground if X is constrained to 
take a fixed value. Note that old is interpreted as ground for variables of 
non-solver types (i.e., types for which there is no solver). The base modes are 
mappings from one base instantiation to another: we use two-letter codes 
(oo, no, og, gg, ng) based on the first letter of the instantiation, e.g. ng is 
new -> ground. The standard modes in and out are renamings of gg and ng, 
respectively. Therefore, line (L4) declares that each argument of mortgage has 
mode oo, i.e., takes an old variable and returns an old variable. 
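
The two-letter codes can be read mechanically. The following Python sketch (not HAL; the names expand_mode, BASE_MODES and ALIASES are hypothetical and introduced only for illustration) expands a base mode code, or one of the renamings in and out, into its pair of instantiations:

```python
# Illustrative only: a lookup table for the base modes described in the text.
INSTANTIATIONS = {"n": "new", "o": "old", "g": "ground"}
BASE_MODES = ("oo", "no", "og", "gg", "ng")
ALIASES = {"in": "gg", "out": "ng"}  # the standard modes are renamings


def expand_mode(code):
    """Return (instantiation on call, instantiation on success)."""
    code = ALIASES.get(code, code)
    if code not in BASE_MODES:
        raise ValueError(f"unknown base mode: {code!r}")
    on_call, on_success = code
    return INSTANTIATIONS[on_call], INSTANTIATIONS[on_success]


print(expand_mode("oo"))   # the mode of every argument of mortgage
print(expand_mode("out"))  # same pair as expand_mode("ng")
```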

More sophisticated instantiations (lying between old and ground) may be 
used to describe the state of complex terms. Instantiation definitions look like 
type definitions. For example, the instantiation definition 

    :- instdef fixed_length_list -> ([] ; [old | fixed_length_list]).

indicates that the variable is bound to either an empty list or a list with an old 
head and a tail with the same instantiation. 

Line (L4) also states the determinism for this mode of mortgage, i.e., how 
many answers it may have. We use the Mercury hierarchy: nondet means any 
number of solutions; multi at least one solution; semidet at most one solution; 
det exactly one solution, failure no solution and erroneous a run-time error. 

The rest of the file contains the standard two rules defining mortgage. 
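
To make the recurrence in the two rules concrete, here is a minimal Python sketch of the special case in which principal, time, interest rate and repayment are all known numbers. It is only a forward evaluation: unlike the HAL program it cannot be run with unknown arguments, and the name mortgage_balance is an assumption made for this illustration.

```python
def mortgage_balance(principal, periods, interest, repayment):
    """Follow the two mortgage rules forward for fully known inputs.

    Returns the final balance, or None where the HAL program would fail
    (the remaining time never reaches exactly 0.0).
    """
    balance, remaining = principal, periods
    while remaining >= 1.0:          # second rule: T >= 1.0
        balance = balance + balance * interest - repayment
        remaining -= 1.0
    if remaining != 0.0:             # first rule requires T = 0.0
        return None
    return balance


print(mortgage_balance(1000.0, 2.0, 0.5, 600.0))  # two periods
print(mortgage_balance(1000.0, 2.5, 0.5, 600.0))  # no rule applies at T = 0.5
```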



3 Constraint Solvers and Type Classes 

Type classes [13,17] support constrained polymorphism by allowing the program- 
mer to write code which relies on a parametric type having certain associated 
predicates and functions. More precisely, a type class is a name for a set of types 
for which certain predicates and/or functions, called the methods, are defined. 
Type classes were first introduced in the functional programming languages Haskell 
and Clean, while Mercury and CProlog were the first logic programming lan- 
guages to include them [12,4]. We have recently extended HAL to provide type 
classes similar to those in Mercury. One major motivation is that they provide a 
natural way of specifying a constraint solver’s capabilities and, therefore, support 
for “plug and play” with solvers. 

A class declaration defines a new type class. It gives the names of the type 
variables which are parameters to the type class, and the methods which form 
its interface. As an example, one of the most important built-in type classes in 
HAL is that defining types which support equality testing: 

    :- class eq(T) where [
         pred T = T,
         mode oo = oo is semidet ].

Instances of this class can be specified, for example, by the declaration 

    :- instance eq(int).

which declares the int type to be an instance of the eq/1 type class. For this to 
be correct, the module must either define the method =/2 with type int=int and 
mode oo=oo is semidet in the current module or indicate that it is a renaming 
of some other predicate. 

We note that all types in HAL (like Mercury) have an associated “equality” 
for modes in=out and out=in, since these correspond to an assignment. Most 
types also support testing for equality, the main exception being for types that 
contain higher-order predicates. Thus, by default, HAL automatically generates 
instance declarations of the above form and the definition of =/2 methods for 
all constructor types which contain types supporting equality. 

Type classes allow us to naturally capture the notion of a type having an 
associated constraint solver: It is a type for which there is a method for initialis- 
ing variables and a method for true equality. Thus, we define the solver/1 type 
class to be: 

    :- class solver(T) <= eq(T) where [
         pred init(T),
         mode init(no) is det ].

This indicates that the solver/1 type class provides an initialisation method 
init/1 and that solver/1 is a subclass of eq/1 and, thus, any instance of 
solver/1 must also be an instance of eq/1. Therefore, for type T to be in the 
solver/1 type class, there must exist predicates init/1 and =/2 for this type 
with mode and determinism as shown. 
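
The subclass relationship can be mimicked with abstract base classes. The Python sketch below is an analogy, not HAL semantics: Eq, Solver and the toy Cell type are hypothetical names, and equals returns False where the semidet method =/2 would fail.

```python
from abc import ABC, abstractmethod


class Eq(ABC):
    """Counterpart of eq/1: types with an equality method."""

    @abstractmethod
    def equals(self, other):
        """Return True on success, False on failure (semidet)."""


class Solver(Eq):
    """Counterpart of solver/1: an Eq type that can also initialise variables."""

    @classmethod
    @abstractmethod
    def init(cls):
        """Return a fresh, unconstrained variable (det)."""


class Cell(Solver):
    """Toy solver type: a variable that is unbound or holds one value."""

    def __init__(self):
        self.parent = self   # variables made equal share a root
        self.value = None    # None means "not yet bound"

    @classmethod
    def init(cls):
        return cls()

    def root(self):
        node = self
        while node.parent is not node:
            node = node.parent
        return node

    def equals(self, other):
        a, b = self.root(), other.root()
        if a is b:
            return True
        if a.value is not None and b.value is not None:
            return a.value == b.value
        if a.value is None:
            a.parent = b     # a now follows b (bound or unbound)
        else:
            b.parent = a     # b was unbound, a carries the value
        return True
```

A class that defines init but not equals cannot be instantiated, which loosely mirrors the requirement that every instance of solver/1 is also an instance of eq/1.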

Constructor types can be automatically declared to be instances of the 
solver/1 type class using the notation deriving solver. The compiler then 
automatically generates an appropriate instance declaration and the predicate 
init/1. Variables whose type is not an instance of solver/1 are not true logic 
variables, i.e., they are like Mercury terms since they must either be new or 
bound to a functor of the type. Thus, the type declaration given earlier for lists 
defines Mercury terms which have a fixed length while 

    :- typedef hlist(T) -> [] ; [T|hlist(T)] deriving solver.

defines true “Herbrand” lists. 

Class constraints can appear as part of a predicate or function’s type signa- 
ture. They constrain the variables in the type signature to belong to particular 
type classes. Class constraints are checked and inferred during the type checking 
phase except for those of type classes solver/1 and eq/1 which must be treated 
specially because they might vary for different modes of the same predicate. In 
the case of solver/1, this will be true if the HAL compiler inserts appropriate 
calls to init/1 for some modes (those in which the argument is initially new) but 
not in others. In the case of eq/1, this will be true if equalities are found to be 
assignments or deconstructions in some modes but true equalities in others. As a 
result, it is not until after mode checking that we can determine which variables 
in the type signature should be instances of eq/1 and/or solver/1. Unfortu- 
nately, mode checking requires type checking to have taken place. Hence, the 
HAL compiler includes an additional phase after mode checking, where newly 
inferred solver/1 and eq/1 class constraints are added to the inferred types of 
procedures for modes that require them. Note that, unlike for other classes, if 
the declared type for a predicate does not contain the inferred class constraints, 
this is not considered an error, unless the predicate is exported.² 

To illustrate the problem, consider the predicate 

    :- pred append(list(T),list(T),list(T)).
    :- mode append(in,in,out) is det.
    :- mode append(in,out,in) is semidet.
    append([],Y,Y).
    append([A|X1],Y,[A|Z1]) :- append(X1,Y,Z1).

During mode checking, the predicate append is compiled into two different pro- 
cedures, one for each mode of usage (indicated by the keyword implemented_by). 
Conceptually, the code after mode checking is 

    :- pred append(list(T),list(T),list(T)) implemented_by [append_1, append_2].
    :- mode append_1(in,in,out) is det.
    append_1(X,Y,Z) :- X =:= [], Z := Y.
    append_1(X,Y,Z) :- X =: [A|X1], append_1(X1,Y,Z1), Z := [A|Z1].
    :- mode append_2(in,out,in) is semidet.
    append_2(X,Y,Z) :- X =:= [], Y := Z.
    append_2(X,Y,Z) :- X =: [A|X1], Z =: [B|Z1], A =:= B, append_2(X1,Y,Z1).

where =:=, := and =: indicate calls to =/2 with mode (in,in), (out,in) and 
(in,out), respectively. It is only now that we see that for the second mode 
the parametric type T must allow equality testing (be an instance of the eq/1 
class), because we need to compare A and B. Thus, in an additional phase of type 
inference the HAL compiler infers 

    :- pred append_2(list(T),list(T),list(T)) <= eq(T).

² Exported predicates need to have all their information available to ensure correct 
modular compilation. We plan to remove this restriction when the compiler fully 
supports cross module optimizing compilation [1]. 
Any predicates calling append_2 will also inherit the eq(T) class constraint in 
their type. 

This is a new problem for type classes and multi-moded predicates which does 
not arise in functional programming. While the same problem arises in Mercury 
for equality, it is side-stepped by not supporting an analogue of the eq/1 class: 
effectively all types are required to support equality for mode =(in,in). This 
may lead to run-time errors (e.g., when using append_2 on lists of predicates). 
Since such errors are caught at compile-time by our two phase scheme, we believe 
our approach provides a better solution. 
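
The asymmetry between the two procedures can be sketched in Python (an analogy only, not the code the HAL compiler generates). The equality test is passed as an explicit argument eq, in the style of dictionary passing, so that it is visible which mode depends on it; the function names mirror append_1 and append_2, and None stands for failure.

```python
import operator


def append_1(x, y):
    """Mode (in, in, out), det: only deconstructs x and constructs z."""
    if not x:
        return list(y)
    return [x[0]] + append_1(x[1:], y)


def append_2(x, z, eq):
    """Mode (in, out, in), semidet: returns y, or None on failure.

    The head elements of x and z must be compared, so an equality
    test on the element type is required.
    """
    if not x:
        return list(z)
    if not z or not eq(x[0], z[0]):
        return None
    return append_2(x[1:], z[1:], eq)


print(append_1([1, 2], [3]))                    # builds the third list
print(append_2([1, 2], [1, 2, 3], operator.eq))  # recovers the second list
print(append_2([1], [2, 3], operator.eq))        # fails: heads differ
```

append_1 never inspects an element, so it works for any element type, whereas a caller of append_2 must supply eq, which corresponds to inheriting the eq(T) class constraint.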

HAL provides a hierarchy of pre-defined type classes for common constraint 
domains which derive from the solver type class: bool_solver, linfloat_solver, 
float_solver, linint_solver, and int_solver. They provide a standard 
interface to solvers, thus facilitating “plug and play” experimentation by allowing 
separate compilation of the constraint models from the solvers that they use. As 
a result, we can rewrite the type declaration for mortgage to 

    :- export pred mortgage(T,T,T,T,T) <= float_solver(T).

thus allowing it to use any solver defined as an instance of float_solver. 

Other important subclasses of the solver type class are herbrand (which 
includes as instances all constructor types declared as deriving solver) and 
its subclass the prolog type class. The role of the herbrand type class is to 
distinguish between constructor types and other user defined solver types. The 
prolog class requires the type to support a number of non-logical operations 
commonly used in Prolog style programming. For instance, it provides var/1 
and nonvar/1 to test if a variable is still uninstantiated or not and standard 
functions to access the components of a term such as functor. It also provides 
the method ===/2 which succeeds only if both its arguments are variables and 
constrained to be equal. 

A constructor type can be declared to support the Prolog built-ins by 
annotating the type declaration with deriving prolog rather than deriving 
solver. In this case, the compiler automatically generates definitions for the 
prolog class methods as well as those for the solver class. Distinguishing 
between herbrand and prolog allows the HAL compiler to differentiate 
types which are used logically from those which are not (useful for optimization). 

4 Dynamic Scheduling 

An important feature of the HAL language is a form of “persistent” dynamic 
scheduling designed specifically to support constraint solving. A delay construct 
is of the form 

    cond_1 ==> goal_1 || ... || cond_n ==> goal_n

where the goal goal_i will be executed when delay condition cond_i is satisfied. 
By default, delayed goals remain active and are reexecuted whenever their delay 
condition becomes true again. This is useful, for example, if the delay condition 
is “the lower bound has changed.” However, delayed goals may also contain calls 
to the special predicate kill/0 which kills all delayed goals in the immediate 
surrounding delay construct; that is, these goals will no longer be active. 

The delay construct of HAL is designed to be extensible, so that programmers 
can build constraint solvers that support delay. In order to do so, one must create 
an instance of the delay/2 type class defined as follows: 

    :- class delay_id(I) where [
         pred get_id(I),
         mode get_id(out) is det,
         pred kill(I),
         mode kill(in) is det ].

    :- class delay(D,I) <= delay_id(I) where [
         pred delay(D, I, pred),
         mode delay(oo, in, in(pred is semidet)) is semidet ].

where type I represents the unique identifier (id) of each delay construct, 
get_id/1 returns an unused id, kill/1 causes all goals delayed for the input 
id to no longer wake up, type D represents the supported delay conditions, and 
delay/3 takes a delay condition, an id and a goal,³ and stores the information 
in order to execute the goal whenever the delay condition holds. 

The separation of the delay type class into two parts allows different solver 
types to share delay ids. Thus, we can build delay constructs which involve more 
than one solver as long as they use a common delay id (the original design of 
delay [3] did not allow this). 

The HAL compiler translates the delay construct into the base delay methods 
provided by the classes. Thus, the delay construct shown above is translated into: 



    get_id(Id), delay(cond_1, Id, goal_1), ..., delay(cond_n, Id, goal_n)

where each call to kill/0 in a goal_i is replaced by a call to kill(Id). 
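
The translation scheme can be imitated with a small store of delayed goals. The Python sketch below is a simplified model and makes several assumptions that are not part of HAL: delay conditions are plain strings matched against an event name, goals are callables that receive a kill function and return True or False, and the names DelayStore, wake and delay_construct are hypothetical.

```python
import itertools
from functools import partial


class DelayStore:
    """Toy counterpart of the delay_id and delay methods."""

    def __init__(self):
        self._next_id = itertools.count()
        self._delayed = []       # (condition, id, goal) triples
        self._killed = set()

    def get_id(self):
        return next(self._next_id)            # an unused id

    def kill(self, delay_id):
        self._killed.add(delay_id)            # goals for this id stay asleep

    def delay(self, cond, delay_id, goal):
        self._delayed.append((cond, delay_id, goal))

    def wake(self, event):
        """Run every live goal delayed on `event`; False if a goal fails."""
        for cond, delay_id, goal in list(self._delayed):
            if cond != event or delay_id in self._killed:
                continue
            if not goal(partial(self.kill, delay_id)):
                return False
        return True


def delay_construct(store, branches):
    """Translate cond_1 ==> goal_1 || ... || cond_n ==> goal_n."""
    delay_id = store.get_id()
    for cond, goal in branches:
        store.delay(cond, delay_id, goal)
    return delay_id


store, log = DelayStore(), []


def on_lower_bound(kill):
    log.append("lb")         # persistent: runs on every wake-up
    return True


def on_bound(kill):
    log.append("bound")
    kill()                   # deactivates both branches of the construct
    return True


delay_construct(store, [("lb_changed", on_lower_bound), ("bound", on_bound)])
for event in ("lb_changed", "lb_changed", "bound", "lb_changed"):
    store.wake(event)
print(log)
```

Because both branches share one id, the kill issued in the second goal also silences the first, which is the behaviour described for kill/0.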

Most modern logic programming languages allow predicates or goals to 
delay until a particular Herbrand variable is bound or is unified with another 
variable. In HAL a programmer can declare this by including deriving delay 
in the declaration for a constructor type. As when deriving from solver/1 or 
prolog/1, the compiler will automatically generate the appropriate methods and 
instance declaration for that type. All such types use the common delay 
conditions bound(X) and touched(X), and the common delay id type system_delay_id 
with its system defined instance of delay_id. Note that system_delay_id can 
also be used in programmer defined solvers. 

As an example of the use of delay in constructing constraint solvers, the 
following program contains the code for (part of) a simple Boolean constraint 
solver.⁴ 

³ To simplify analysis, each goal_i must be semidet and may not change the 
instantiation of variables. As a result, delayed code cannot invalidate the mode 
and determinism checking when woken up. 

⁴ Note that the touched delayed goals are included only for illustration; they are 
not used in the experiments. 
    :- module bool_delay.
    :- instance bool_solver(boolv).
    :- export_abstract typedef boolv -> ( f ; t ) deriving [prolog, delay].

    :- export func true --> boolv.
    true --> t.

    :- export pred and(boolv, boolv, boolv).
    :-        mode and(oo, oo, oo) is semidet.
    and(X,Y,Z) :-
        ( bound(X) ==> kill, (X = f -> Z = f ; Y = Z)
        | bound(Y) ==> kill, (Y = f -> Z = f ; X = Z)
        | bound(Z) ==> kill, (Z = t -> X = t, Y = t ; notboth(X,Y)) ).

    :- export func false --> boolv.
    false --> f.

    :- pred notboth(boolv, boolv).
    :- mode notboth(oo, oo) is semidet.
    notboth(X,Y) :-
        ( bound(X) ==> kill, (X = t -> Y = f ; true)
        | bound(Y) ==> kill, (Y = t -> X = f ; true)
        | touched(X) ==> (X === Y -> kill, X = f ; true)
        | touched(Y) ==> (X === Y -> kill, X = f ; true) ).



The constructor type boolv is used to represent Booleans. Notice how the 
class functions true and false are simply defined to return the appropriate 
value, while the and predicate delays until one argument has a fixed value, and 
then constrains the other arguments appropriately. In the case of notboth we 
also test if two variables are identical. Hence, boolv must be declared as an 
instance of both the prolog type class and the delay type class (and, hence, 
implicitly as an instance of the solver type class.) 
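
The propagation performed by and once an argument becomes bound can be summarised as a function on partial assignments. The Python sketch below is a simplification under stated assumptions: each argument is True, False or None (unbound), the hypothetical function propagate_and returns the strengthened triple or None on failure, and aliasing of two unbound variables (as produced by Y = Z, and tested by === in notboth) is not modelled.

```python
def propagate_and(x, y, z):
    """Propagate and(X, Y, Z) to a fixpoint over True/False/None values."""
    while True:
        before = (x, y, z)
        if x is False or y is False:       # X = f -> Z = f (same for Y)
            if z is True:
                return None
            z = False
        if x is True:                       # X = t: Y = Z
            if y is None:
                y = z
            elif z is None:
                z = y
            elif y != z:
                return None
        if y is True:                       # Y = t: X = Z
            if x is None:
                x = z
            elif z is None:
                z = x
            elif x != z:
                return None
        if z is True:                       # Z = t -> X = t, Y = t
            if x is False or y is False:
                return None
            x, y = True, True
        if (x, y, z) == before:             # nothing new was deduced
            return x, y, z


print(propagate_and(False, None, None))   # Z becomes False
print(propagate_and(True, None, False))   # notboth forces Y to False
print(propagate_and(True, True, False))   # inconsistent
```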



5 Using External Solvers from HAL 

One of the main design requirements on the HAL language is that it should 
readily support integration into foreign language applications and, in particular, 
allow constraint solvers written in other languages to be called with relatively 
little overhead. An example of such a solver is CPLEX [10], a simplex based solver 
supporting linear arithmetic constraints.⁵ This section details our experience 
integrating CPLEX into HAL. 

⁵ CPLEX also provides routines for mixed integer programming and barrier methods 
but we have not yet integrated these. 

The HAL interface for CPLEX is built on top of three Mercury predicates: 
the function initialise_cplex which returns a CPLEX solver instance CP, 
the predicate add_column(CP,n) which adds n columns to the tableau, and 
the predicate add_equality(CP, [(c_1,v_1), ..., (c_n,v_n)], b) which adds the 
equation c_1·v_1 + ... + c_n·v_n = b to the tableau. These predicates wrap the C 
interface functions of CPLEX. This is easy to do since C code can be directly 
written as part of a Mercury predicate body. These predicates also handle trailing 
and restoration of choice points. This is done by using the higher-order predicate 
trail/1, which places its argument (a predicate closure) on the function trail 
to be called in the event of backtracking to trail/1. 
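
The idea of a function trail can be sketched as a stack of undo closures. The Python code below is a generic illustration rather than the Mercury implementation; FunctionTrail, mark, backtrack_to and TrailedTableau are hypothetical names, and the "tableau" is just a list of column names.

```python
class FunctionTrail:
    """A stack of closures to run when execution backtracks."""

    def __init__(self):
        self._undo = []

    def trail(self, undo):
        self._undo.append(undo)        # remember how to undo one update

    def mark(self):
        return len(self._undo)         # stands in for a choice point

    def backtrack_to(self, mark):
        while len(self._undo) > mark:  # undo newest updates first
            self._undo.pop()()


class TrailedTableau:
    """Toy external solver state whose updates are trailed."""

    def __init__(self):
        self.columns = []
        self.trail = FunctionTrail()

    def add_column(self, name):
        self.columns.append(name)
        self.trail.trail(self.columns.pop)   # undo: drop the last column


tableau = TrailedTableau()
tableau.add_column("x")
choice_point = tableau.trail.mark()
tableau.add_column("y")
tableau.trail.backtrack_to(choice_point)
print(tableau.columns)   # the column added after the choice point is gone
```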

A naive way to write the interface in HAL is as follows. 

    :- module cplex.
    :- instance linfloat_solver(cfloat).
    :- import int.
    :- export_abstract typedef cfloat -> col(int).
    :- reinst_old cfloat = ground.
    :- glob_var CPLEX has_type cplex_instance init_value initialise_cplex.
    :- glob_var VarNum has_type int init_value 0.

    :- export_only pred init(cfloat).
    :-             mode init(no) is det.
    init(V) :- V = col($VarNum), $VarNum := $VarNum + 1, add_column($CPLEX, 1).

    :- export_only pred cfloat = cfloat.
    :-             mode oo = oo is semidet.
    V1 = V2 :- add_equality($CPLEX, [(1.0,V1), (-1.0,V2)], 0.0).

    :- export func cfloat + cfloat --> cfloat.
    V1 + V2 --> V3 :-
        init(V3),
        trust_det add_equality($CPLEX, [(1.0,V1), (1.0,V2), (-1.0,V3)], 0.0).

    :- export func float × cfloat --> cfloat.
    C × V1 --> V2 :-
        init(V2),
        trust_det add_equality($CPLEX, [(C,V1), (-1.0,V2)], 0.0).

    :- coerce coerce_float(float) --> cfloat.
    :- export func coerce_float(float) --> cfloat.
    coerce_float(C) --> V :-
        init(V),
        trust_det add_equality($CPLEX, [(1.0,V)], C).

The solver type cfloat is a wrapped integer giving the column number of the 
variable in the CPLEX tableau. It is exported abstractly to provide an abstract 
data type, and declared to be an instance of the linear arithmetic constraint 
solver class linfloat_solver. The reinst_old declaration states that the 
instantiation old for cfloats must be interpreted as ground inside this module, 
reflecting their internal implementation. We use two global variables: CPLEX for 
storing the CPLEX instance, and VarNum for storing the number of variables 
(columns) in the solver. 

The predicate init/1 simply increments the counter VarNum and adds a 
column to the CPLEX tableau. The =/2 predicate adds an equality to the CPLEX 
tableau. Both are designated as export_only, which makes them visible 
outside the module, but not inside. This avoids confusion with the internal view 
of cfloats as wrapped integers rather than the external view as float variables. 
The function +/2 initialises a new variable to be the result of the addition and 
adds an equality constraint to compute the result. The trust_det annotation 
allows the compiler to pass the determinism check (the solver author knows that 
this call to add_equality will not fail). The linear multiplication function ×/2 
is defined similarly to +/2. 
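
To see how many columns and equalities this naive scheme generates, the interface can be mimicked against a mock tableau that merely records what it is sent. The Python sketch below does not call CPLEX; NaiveSolver and its method names are assumptions made for illustration, chosen to mirror init, =, +, × and coerce_float.

```python
class NaiveSolver:
    """Mock of the naive interface: every operation touches the tableau."""

    def __init__(self):
        self.columns = 0        # plays the role of VarNum
        self.equalities = []    # recorded (terms, right-hand side) pairs

    def init(self):
        col = self.columns
        self.columns += 1       # add_column($CPLEX, 1)
        return col

    def _add_equality(self, terms, rhs):
        self.equalities.append((terms, rhs))

    def equal(self, v1, v2):
        self._add_equality([(1.0, v1), (-1.0, v2)], 0.0)

    def plus(self, v1, v2):
        v3 = self.init()
        self._add_equality([(1.0, v1), (1.0, v2), (-1.0, v3)], 0.0)
        return v3

    def times(self, c, v1):
        v2 = self.init()
        self._add_equality([(c, v1), (-1.0, v2)], 0.0)
        return v2

    def coerce_float(self, c):
        v = self.init()
        self._add_equality([(1.0, v)], c)
        return v


# The single source-level constraint X = 2.0 × Y + 3.0:
s = NaiveSolver()
x, y = s.init(), s.init()
s.equal(x, s.plus(s.times(2.0, y), s.coerce_float(3.0)))
print(s.columns, len(s.equalities))
```

In this model one linear equation over two variables produces three extra columns and four equalities, which illustrates the kind of blow-up the naive interface is prone to.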
Since cfloats are constrained floats, it is convenient to be able to use floating 
point constants in place of cfloats. HAL allows the solver programmer to specify 
the automatic coercion of a base type to a solver type. In our example, the coerce 
directive declares that coerce_float is a coercion function and the next three 
lines give its type, mode and definition. 

Unfortunately, this naive interface has a high overhead. One issue is that 
many arithmetic constraints are simple assignments or tests which do not require 
the power of a linear constraint solver. Thus, we can improve the interface by only 
passing “real” constraints to the solver and “solving” simple assignments and 
tests in the interface functions themselves. This can be done easily by redefining 
the cfloat type to wrap either a true variable or a constant value and redefining 
our interface functions to handle the different cases appropriately. 

Another issue is that the interface splits complex linear equations into a 
large number of intermediate constraints and variables. A better approach is 
to have + and × build up a data structure representing the linear constraint. 
More precisely, we can redefine cfloat to be this data structure and for +/2, 
×/2, init/1 and coerce_float/1 to build the data structure. As a by-product, 
this data structure can also be used to track constants and perform tests and 
assignments in the interface. The modified code is: 

    :- export_abstract typedef cfloat -> cfloat(float, list(cterm)).
    :- typedef cterm -> (float, int).

    init(V) :- V = cfloat(0.0, [(1.0,$VarNum)]),
        $VarNum := $VarNum + 1, add_column($CPLEX, 1).

    cfloat(C1,Vs1) = cfloat(C2,Vs2) :-
        negate_coeffs(Vs2, NewVs2),
        append(Vs1, NewVs2, Terms),
        add_equality($CPLEX, Terms, C2-C1).

    cfloat(C1,Vs1) + cfloat(C2,Vs2) -->
        cfloat(C1+C2, Vs) :- append(Vs1, Vs2, Vs).

    C × cfloat(F,Vs) --> cfloat(C*F,NewVs) :- multiply_coeffs(C, Vs, NewVs).

    coerce_float(C) --> cfloat(C, []).
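
The effect of the data structure can again be checked against a mock tableau. In the Python sketch below (not the HAL code, and not connected to CPLEX) a linear expression is a pair of a constant and a list of (coefficient, column) terms; LinExprSolver and its method names are hypothetical. The constant-only branch of equal illustrates the by-product mentioned in the text and is an addition of this sketch, not part of the code shown above.

```python
class LinExprSolver:
    """Mock interface where + and × only build (constant, terms) pairs."""

    def __init__(self):
        self.columns = 0
        self.equalities = []

    def init(self):
        col = self.columns
        self.columns += 1
        return (0.0, [(1.0, col)])

    @staticmethod
    def coerce_float(c):
        return (c, [])                       # no column, no constraint

    @staticmethod
    def plus(a, b):
        return (a[0] + b[0], a[1] + b[1])    # append the term lists

    @staticmethod
    def times(c, a):
        return (c * a[0], [(c * k, v) for k, v in a[1]])

    def equal(self, a, b):
        terms = a[1] + [(-k, v) for k, v in b[1]]
        if not terms:                        # both sides are constants:
            return a[0] == b[0]              # a test solved in the interface
        self.equalities.append((terms, b[0] - a[0]))
        return True


# The same constraint X = 2.0 × Y + 3.0 as one equality:
s = LinExprSolver()
x, y = s.init(), s.init()
s.equal(x, s.plus(s.times(2.0, y), s.coerce_float(3.0)))
print(s.columns, s.equalities)
```

Here the constraint reaches the tableau as the single equation x - 2y = 3 over the two original columns.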

Also, in external solvers such as CPLEX that are not specialized for 
incremental satisfiability checking, the usual CLP approach of checking 
satisfiability after each new constraint is added may be expensive. We can 
therefore improve performance by “batching” constraints and requiring the 
programmer to explicitly call the solver to check for satisfiability. 

6 Using Constraint Handling Rules (CHRs) 

Constraint Handling Rules (CHRs) have proven to be a very flexible formalism 
for writing incremental constraint solvers and other reactive systems. In effect, 
the rules define transitions from one constraint set to an equivalent constraint 
set. Rules are repeatedly applied until no new rule can be applied. Once applied, 
a rule cannot be undone. For more details the interested reader is referred to [6]. 

The simplest kind of rule is a propagation rule of the form 
    lhs ==> guard | rhs

where lhs is a conjunction of CHR constraints, guard is a conjunction of 
constraints of the underlying language (in practice this is any goal not involving 
CHR constraints) and rhs is a conjunction of CHR constraints and constraints 
of the underlying language. The rule states that if there is a set S appearing 
in the global CHR constraint store G that matches lhs such that goal guard 
is entailed by the current constraints, then we should add the rhs to the store. 
Simplification rules have a similar form (replacing the ==> with a <=>) and 
behavior except that the matching set S is deleted from G. A syntactic extension 
allows only part of the lhs to be eliminated by a simplification rule: 

    lhs_1 \ lhs_2 <=> guard | rhs

indicates that only the set matching lhs_2 is eliminated. 

Efficient implementations of CHRs are provided for SICStus Prolog, Eclipse 
Prolog (see [5]) and Java [11]. Recently, they have also been integrated into 
HAL [8]. As in most implementations, HAL CHRs sit on top of the “host” 
language. More exactly, they may contain HAL code and are essentially compiled 
into HAL in a pre-processing stage of the HAL compiler. As a consequence, CHR 
constraints defined in HAL require the programmer to provide type, mode and 
determinism declarations. 

The following program is part of a Boolean solver implemented in HAL using 
CHRs.⁶ 



    :- module bool_chr.
    :- instance bool_solver(boolv).
    :- export_abstract typedef boolv -> wrap(int).
    :- reinst_old boolv = ground.
    :- glob_var VNum has_type int init_value 0.

    :- export_only pred init(boolv).
    :-             mode init(no) is det.
    init(V) :- V = wrap($VNum), $VNum := $VNum + 1.

    :- export constraint true(boolv).
    :-        mode true(oo) is semidet.
    :- export constraint false(boolv).
    :-        mode false(oo) is semidet.
    true(X), false(X) <=> fail.

    :- export constraint and(boolv,boolv,boolv).
    :-        mode and(oo,oo,oo) is semidet.
    true(X)  \ and(X,Y,Z) <=> Y = Z.
    true(Y)  \ and(X,Y,Z) <=> X = Z.
    false(X) \ and(X,Y,Z) <=> false(Z).
    false(Y) \ and(X,Y,Z) <=> false(Z).
    false(Z) \ and(X,Y,Z) <=> notboth(X,Y).
    true(Z)  \ and(X,Y,Z) <=> true(X), true(Y).



In this case boolvs are simply variable indices⁷ and Boolean constraints and 
values are implemented using CHR constraints. Initialization simply builds a 
new term and increments the Boolean variable counter VNum, which is a global 
variable. The constraint declaration is like a pred declaration except that it 
indicates that it is a CHR predicate. The mode and determinism for each CHR 
constraint are defined as usual. The remaining parts are CHRs. The first rule 
states that if a variable is given both truth values, true and false, we should fail. 
The next rule (for and/3) states that if the first argument is true we can replace 
the constraint by an equality of the remaining arguments. 

⁶ Somewhat simplified for ease of exposition. 

⁷ HAL does not yet support CHRs on Herbrand types. 

In HAL, CHR constraints must have a mode which does not change the 
instantiation of their arguments (like oo or in) to preserve mode safety, since the 
compiler is unlikely to statically determine when rules fire. Predicates appearing 
in the guard must also be det or semidet and not alter the instantiation of 
variables appearing in the left hand side of the CHR (this means they are implied 
by the store). This is a weak restriction since, typically, guards are simple tests. 
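
The way such rules rewrite the constraint store can be imitated by a small fixpoint loop. The Python sketch below is a toy model, not the HAL CHR compiler: constraints are tuples held in a set, the function name solve_bool_chr is hypothetical, and the equalities and notboth constraints produced by some rules are simply left in the store as unsolved ("eq", ...) and ("notboth", ...) entries instead of being solved further.

```python
def solve_bool_chr(constraints):
    """Apply the and/3 rules until none fires; None models failure."""
    store = set(constraints)
    changed = True
    while changed:
        changed = False
        for c in list(store):
            # true(X), false(X) <=> fail.
            if c[0] == "true" and ("false", c[1]) in store:
                return None
            if c[0] != "and":
                continue
            _, x, y, z = c
            # Simplification with a kept head: and(X,Y,Z) is removed,
            # the true/false constraint that triggered the rule stays.
            if ("true", x) in store:
                new = {("eq", y, z)}
            elif ("true", y) in store:
                new = {("eq", x, z)}
            elif ("false", x) in store or ("false", y) in store:
                new = {("false", z)}
            elif ("false", z) in store:
                new = {("notboth", x, y)}
            elif ("true", z) in store:
                new = {("true", x), ("true", y)}
            else:
                continue
            store.remove(c)
            store |= new
            changed = True
    return store


print(solve_bool_chr([("and", "a", "b", "c"), ("false", "a")]))
print(solve_bool_chr([("and", "a", "b", "c"), ("true", "c"), ("false", "a")]))
```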



7 Evaluation 

Our first experiment has three aims. First, it illustrates the use of type classes 
for “plug and play” with solvers. Second, it determines the overhead of using 
type classes when implementing solvers. Third, it evaluates the efficiency of the 
generic solver writing constructs supported by HAL: dynamic scheduling and 
CHRs. For this experiment we created three implementations of a 
propagation-based Boolean constraint solver: using dynamic scheduling (dyn); 
using CHRs (chr); and using conversion to integer constraints (int).⁸ We give two 
results for each HAL solver: solv_t, which uses type classes for separate 
compilation, where each query module was compiled separately from the solver and 
joined at link time; and solv_i, where the query module imported the solver and 
was compiled with this knowledge, so removing the overhead of type classes. It is 
important to note that type classes allowed us to use identical code for the 
benchmarks: only at linking time did we need to choose which solver to use. 

To evaluate the efficiency of HAL,⁹ we also built comparable solvers in 
SICStus Prolog: using the generic when delay mechanism (SICS_w) closest to our 
generic delay mechanism; using the CHRs of SICStus (SICS_c); and using the 
clpfd integer propagation solver (SICS_i). Finally, for interest, we provide two 
more SICStus solvers: (SICS_b) a dynamic scheduling solver using the highly 
restricted but efficient block mechanism of SICStus, and (SICS_v) where the 
ground variable numbers in the CHR solver are replaced by Prolog variables, 
allowing the use of attribute variable indices. 

The comparison uses five simple Boolean benchmarks (most from [2]): the 
first, pigeon n-m, places n pigeons in m pigeon holes (the 24-24 query succeeds, 
while 8-7 fails); schur n, Schur’s lemma for n (see [2]) (the 13 query is the largest 
n that succeeds); queens n, the Boolean version of this classic problem; mycie n-m, 
which colors a 5-colorable graph (taken from [16]) with n nodes and m edges 
with 4 colours; and fulladder, which searches for a single faulty gate in an n-bit 
adder (see e.g. [14] for the case of 1 bit). 

⁸ More exactly, we use the integer propagation solver described in [7] which is 
implemented in C and interfaced to HAL using the methodology described in Section 5. 

⁹ It is much easier to build highly flexible but inefficient mechanisms for defining 
solvers. 
Table 1. Comparison of Boolean solvers for dynamic scheduling. 

                                           Dynamic Scheduling
Benchmark       Var     Con  Search    dyn_t   dyn_i  SICS_w  SICS_b
mycie23_71      184     583   19717     1855    1816   34769    1920
fulladder       135     413    1046      472     471    5181     147
pigeon24_24    1152   13896     576      805     822    2258      56
pigeon8_7       112     444   24296      931     901   16870     843
queens18        972   13440   42168     8904    8818  125250    7316
schur13         178     456      57       22      18     118       4
schur14         203     525     450       98     112    1308      63

Table 2. Comparison of Boolean solvers using an existing integer solver and CHRs. 

                             CHRs                         Integer
Benchmark       chr_t   chr_i   SICS_c  SICS_v    int_t  int_i  SICS_i
mycie23_71      25073   25070   200613   76567     1279   1251    5339
fulladder        4240    4270    24770   13840      178    175     313
pigeon24_24     71455   70785   107366   71048       81     72     957
pigeon8_7       11313   11176    61126   36251      765    726    3166
queens18       504750  511620  1350636  263433     4522   4438   12363
schur13            53      63      450     201        9      7      51
schur14           823     830     5966    2319       55     52     278

Table 1 gives an indication of how much work the solvers are performing for 
each benchmark. Var is the number of variables initialised by the solver, Con is 
the number of Boolean constraints, and Search is the number of labeling steps 
performed (using the default labeling strategy to find a first solution). Note 
that each solver implements exactly the same propagation strength on Boolean 
constraints and, thus, for each benchmark each different solver performs exactly 
the same search. All timings are the average over 10 runs on a dual Pentium 
II-400MHz with 384M of RAM running under Linux RedHat 5.2 with kernel 
version 2.2, and are given in milliseconds. SICStus Prolog 3.8.4 is run under 
compact code (no fastcode for Linux). 

From Table 1 it is clear that the generic delay mechanism implemented in 
HAL is reasonably efficient. In comparison with the propagation happening in 
C in the integer solver, the dynamic scheduled version is only 4 times slower. It 
also compares well with the generic dynamic scheduling of SICStus. However, 
the block based dynamic scheduling of SICStus illustrates how delay that is 
tightly tied to the execution mechanism can be very efficient. 

Table 2 shows that the CHR solver mechanism for HAL (at least for this 
example) is significantly faster than the SICStus equivalent, so much so that 
in this case even the use of attributed variable indexing does not regain the 
difference except in the biggest examples. We are currently working on adding 
indices to HAL CHRs. 

Table 3. Executing CPLEX using the various HAL interfaces. 

              naive            constants              datastructures           ECL
Bench       Con      inc     Con     inc  batch    Con    inc  batch  +opt   Con  batch
fib        2557  1329000     465   49010   5950    233  25280   2910  2690   232   7650
laplace     347     9070     298    5140   1550     20    950    210   210    90   1070
matmul      nonlinear        684  142830  15020    216  27390   2950  2270   432  10050
mortgage    nonlinear        482   89940  18200      2    810    750  1520   240   6390

Examination of both tables shows that the type class mechanism does not 
add substantial overhead to the use of constraint solvers: The overhead of type 
classes varies up to 3.5% (ignoring the 28% on very small times), and the average 
overhead is just 2%. 

Note that this experiment is not meant to be an indication of the merits of 
the different approaches since, for building different solvers, each approach has 
its place. 

Our second experiment compares the speed of the HAL interfaces defined in 
Section 5: the naive interface, the interface constants that keeps track of when 
cfloats are constants and solves assignments and tests in the interface itself, and 
the interface datastructures which builds data structures to handle function 
calls, and only sends constraints at predicate calls. For the last two, we run the 
solver incrementally (solving after every constraint addition) and in batch mode 
(explicitly calling a solve predicate). For the last interface we also provide a 
version (+opt) which implements a simple type of partial evaluation by making 
use of Mercury to aggressively inline predicates and functions even across module 
boundaries. Finally, we compare against the ECLiPSe [9] interface (ECL) to the 
CPLEX solver, which also batches constraints. 

The benchmarks are standard small linear arithmetic examples (see e.g. [3]). 
The table gives the number of constraints sent to CPLEX by each solver (Con), 
and execution times in milliseconds for 100 executions of the program. The 
last two benchmarks involve nonlinear constraints (not handled by CPLEX) if 
constants are not kept track of. 

It is clear from Table 3 that the naive interface is impractical. Tracking con- 
stants and performing assignments and tests in the interface itself significantly 
improves speed. The move to using data structures to build linear expressions 
is clearly important in practice. Using this technique, the number of constraints 
passed to CPLEX for mortgage is reduced to just 2. Finally, batching is clearly 
worthwhile in these examples. 
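The effect of constant tracking and batching can be illustrated with a small Python sketch. It is not HAL code and it does not reproduce the interfaces of Section 5: the class `ConstantTrackingInterface`, its methods, and the list that stands in for the external solver are all hypothetical, and only equalities between numbers and variable names are modelled.

```python
class ConstantTrackingInterface:
    """Toy solver interface; every name here is hypothetical."""

    def __init__(self, batch=True):
        self.batch = batch
        self.known = {}           # variable -> constant known to the interface
        self.constrained = set()  # variables mentioned in a posted constraint
        self.pending = []         # constraints buffered until solve() in batch mode
        self.sent = []            # constraints passed to the simulated solver
        self.solver_calls = 0     # how often the simulated solver was invoked

    def _constant(self, term):
        """Return the value of a number or of a known variable, else None."""
        if isinstance(term, (int, float)):
            return term
        return self.known.get(term)

    def equal(self, lhs, rhs):
        """Post lhs = rhs; each side is a number or a variable name."""
        lv, rv = self._constant(lhs), self._constant(rhs)
        if lv is not None and rv is not None:
            return lv == rv                      # test, decided in the interface
        for var, value in ((lhs, rv), (rhs, lv)):
            if value is not None and var not in self.constrained:
                self.known[var] = value          # assignment, decided in the interface
                return True
        self._post(("=", lhs, rhs))              # anything else goes to the solver
        return True

    def _post(self, constraint):
        self.constrained.update(t for t in constraint[1:] if isinstance(t, str))
        if self.batch:
            self.pending.append(constraint)
        else:
            self.sent.append(constraint)
            self.solver_calls += 1               # incremental: solve after every addition

    def solve(self):
        """Batch mode: hand all buffered constraints to the solver in one call."""
        if self.pending:
            self.sent.extend(self.pending)
            self.pending.clear()
            self.solver_calls += 1


iface = ConstantTrackingInterface(batch=True)
results = [
    iface.equal("X", 3),    # assignment, never reaches the solver
    iface.equal("Y", "X"),  # assignment, because X is a known constant
    iface.equal("Y", 4),    # test on two constants, fails
    iface.equal("A", "B"),  # genuine constraint, buffered
    iface.equal("A", 4),    # A is already constrained, so this is buffered too
]
iface.solve()
```

In this sketch five posted equalities result in two constraints and a single solver call, which mirrors in miniature the drop in the Con and batch columns of Table 3.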

This experiment shows that the external solver interface for CPLEX is 
considerably faster than that provided by ECLiPSe and we believe that there is still 
considerable scope for improvement. Inlining of predicates and functions across 
module boundaries provides substantial improvement, but we believe that we 
can do even better by partially evaluating away many of the calls to the solver 
interface and building the arguments to the calls to CPLEX at compile time if the 
constraint is known. Similarly, we would like to automatically perform “batching” 
by making HAL introduce satisfiability checks just before a choice point 
is created. An important lesson from the second experiment is that it is vital 
for a CLP language to allow easy experimentation with the interface to external 
solvers, since the choice of interface can make a crucial difference to performance. 
Our experience with HAL has been very positive in this regard. 

Note that using bindings to represent true and false would result in a more efficient 
SICStus CHR Boolean solver, but the equivalent is not possible in HAL (at present). 



References 

1. F. Bueno, M. Garcia de la Banda, M. Hermenegildo, K. Marriott, G. Puebla, and 
P.J. Stuckey. A model for inter-module analysis and optimizing compilation. In 
Procs of LOPSTR2000, volume 2042 of LNCS, pages 86-102, 2001. 

2. P. Codognet and D. Diaz. Boolean constraint solving using clp(FD). In Procs. of 
ILPS’1993, pages 525-539. MIT Press, 1993. 

3. B. Demoen, M. Garcia de la Banda, W. Harvey, K. Marriott, and P.J. Stuckey. An 
overview of HAL. In Procs. of PPCP’99, LNCS, pages 174-188, 1999. 

4. A.J. Fernandez and B.C. Ruiz Jimenez. Una semantica operacional para GProlog. 
In Proceedings of II Jornadas de Informática, pages 21-30, 1996. 

5. T. Frühwirth. CHR home page, www.informatik.uni-muenchen.de/~fruehwir/chr/. 

6. T. Frühwirth. Theory and practice of constraint handling rules. Journal of Logic 
Programming, 37:95-138, 1998. 

7. W. Harvey and P.J. Stuckey. Constraint representation for propagation. In 
Procs. of PPCP’98, LNCS, pages 235-249. Springer-Verlag, 1998. 

8. C. Holzbaur, P.J. Stuckey, M. Garcia de la Banda, and D. Jeffery. Optimizing 
compilation of constraint handling rules. In Procs. of ICLP17, LNCS, 2001. 

9. IC PARC. ECLiPSe prolog home page, http://www.icparc.ic.ac.uk/eclipse/. 

10. ILOG. CPLEX product page, http://www.ilog.com/products/cplex/. 

11. Jack: Java constraint kit. http://www.fast.de/ mandel/jack/. 

12. D. Jeffery, F. Henderson, and Z. Somogyi. Type classes in Mercury. Technical 
Report 98/13, University of Melbourne, Australia, 1998. 

13. S. Kaes. Parametric overloading in polymorphic programming languages. In 
ESOP’88 Programming Languages and Systems, volume 300 of LNCS, pages 131- 
141, 1988. 

14. K. Marriott and P.J. Stuckey. Programming with Constraints: an Introduction. 
MIT Press, 1998. 

15. Z. Somogyi, F. Henderson, and T. Conway. The execution algorithm of Mercury: 
an efficient purely declarative logic programming language. JLP, 29:17-64, 1996. 

16. M. Trick. mat.gsia.cmu.edu/COLOR/color.html. 

17. P. Wadler and S. Blott. How to make ad-hoc polymorphism less ad-hoc. In Proc. 
16th ACM POPL, pages 60-76, 1989. 




Practical Aspects for a Working Compile Time 
Garbage Collection System for Mercury 



Nancy Mazur¹, Peter Ross², Gerda Janssens¹, and Maurice Bruynooghe¹ 

¹ Department of Computer Science, K.U. Leuven 
Celestijnenlaan, 200A, B-3001 Heverlee, Belgium 
{nancy,gerda,maurice}@cs.kuleuven.ac.be 
² Mission Critical, Dreve Richelle, 161, Bat. N 
B-1410 Waterloo, Belgium 
petdr@miscrit.be 



Abstract. Compile-time garbage collection (CTGC) is still a very 
uncommon feature within compilers. In previous work we have developed 
a compile-time structure reuse system for Mercury, a logic programming 
language. This system indicates which datastructures can safely 
be reused at run-time. As preliminary experiments were promising, we 
have continued this work and now have a working and well performing 
near-to-ship CTGC-system built into the Melbourne Mercury Compiler 
(MMC). 

In this paper we present the multiple design decisions leading to this 
system, we report the results of using CTGC for a set of benchmarks, 
including a real-world program, and finally we discuss further possible 
improvements. Benchmarks show substantial memory savings and a 
noticeable reduction in execution time. 



1 Introduction 

Modern programming languages typically limit the possibilities of the program- 
mer to manage memory directly. In such cases allocation and deallocation is 
delegated to the run-time system and its garbage collector, at the expense of 
possible run-time overhead. Declarative languages go even further by prohibit- 
ing destructive updates. This increases the run-time overhead considerably: new 
datastructures are created instead of updating existing ones, hence garbage col- 
lection will be needed more often. 

Special techniques have been developed to overcome this handicap and to 
improve the memory usage, both for logic programming languages [10,14,18] and 
functional languages [21,16]. Some of the approaches depend on a combination 
of special language constructs and analyses using unique objects [19,1,22], some 
are solely based on compiler analyses [13,16], and others combine it with special 
memory layout techniques [21]. In this work we develop a purely analysis based 
memory management system. 

Mercury, a modern logic programming language with declarations [19], profiles 
itself as a general purpose programming language for large industrial projects. 
Memory requirements are therefore high. Hence we believe it is a useful research 
goal to develop a CTGC-system for this language. In addition, mastering it for 
Mercury should be a useful stepping stone for systems such as Ciao Prolog [12] 
(which has optional declarations and includes the impurities of Prolog) and 
HAL [8] (a Mercury-based constraint language). 

The intention of the CTGC-system is to discover at compile-time when data 
is not referenced anymore, and how it can best be reused. Mulkers et al. [18] 
have developed an analysis for Prolog which detects when memory cells become 
available for reuse. This analysis was first adapted to languages with declara- 
tions [3] and then refined for use in the presence of program modules [15]. A first 
prototype implementation was made to measure the potential of the analysis 
for detecting dead memory cells. As the results of the prototype were promis- 
ing [15], we have continued this work and implemented a full CTGC-system for 
the Melbourne Mercury Compiler (MMC), focusing on minimizing the memory 
usage of a program. In this paper we present the different design decisions that 
had to be taken to obtain noticeable memory savings, while remaining easy to 
implement within the MMC and with acceptable compilation overhead. A series 
of benchmarks are given, measuring not only the global effect of CTGC, but also 
the effect of the different decisions during the CTGC analysis. 

After presenting some background in Section 2, we first solve the problem of 
deciding how to perform reuse once it is known which cells might die (Section 3) . 
Section 4 presents low-level additions required to increase precision and speed, 
and obtain the first acceptable results for a set of benchmarks (Section 5). Using 
cell-caching (Section 6) more memory savings can be obtained. Finally improve- 
ments related to other work are suggested (Section 7), followed by a conclusion 
(Section 8). 

2 Background 

2.1 Mercury 

Mercury [11] is a logic programming language with types, modes and determin- 
ism declarations. Its type system is based on a polymorphic many-sorted logic 
and its mode-system does not allow partially instantiated datastructures. 

The analysis performed by our CTGC-system is at the level of the High 
Level Data Structure (HLDS) constructed by the MMC. Within this structure, 
predicates are normalized, i.e. all atoms appearing in the program have distinct 
variables as arguments, and all unifications X = Y are made explicit as one of (1) a 
test X == Y (both are ground terms), (2) an assignment X := Y (X is free, Y 
is ground), (3) a construction X <= f(Y1, ..., Yn) (X is free, all Yi are ground), 
or (4) a deconstruction X => f(Y1, ..., Yn) (X is ground, all Yi are free) [11]. 
Within the HLDS, the atoms of a clause body are ordered such that the body 
is well moded. In the paper, we will use the explicit modes. 
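The four normalized forms can be told apart from the instantiation states alone. The following Python sketch only illustrates that case analysis; the function name `classify_unification` and the encoding of the right-hand side are assumptions made for the example and are not part of the MMC.

```python
def classify_unification(x_ground, rhs):
    """Classify a unification X = rhs from instantiation states.

    x_ground: True if X is ground, False if X is free.
    rhs: ("var", y_ground) for X = Y, or
         ("term", [y1_ground, ..., yn_ground]) for X = f(Y1, ..., Yn).
    """
    kind, state = rhs
    if kind == "var":
        if x_ground and state:
            return "test"            # X == Y
        if not x_ground and state:
            return "assignment"      # X := Y
    elif kind == "term":
        if not x_ground and all(state):
            return "construction"    # X <= f(Y1, ..., Yn)
        if x_ground and not any(state):
            return "deconstruction"  # X => f(Y1, ..., Yn)
    raise ValueError("not one of the four normalized forms")
```

Any other combination, such as a partially ground argument list, is rejected here, in line with the statement in Section 2.1 that the mode system does not allow partially instantiated datastructures.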

Just like in the HLDS we will use the notion of a procedure, i.e. a combination 
of one predicate with one mode, and thus talk about the analysis of a procedure. 

2.2 General Structure of the CTGC-System 

The CTGC-system consists of a data-flow analysis, followed by a reuse analysis, 
and concluded by a code generation pass (similar to [10]). 

The data-flow analysis is performed to obtain structure-sharing information 
(expressed as possible aliases [3]) and to detect when heap cells become dead 
and are therefore available for reuse. It is based on abstract interpretation [2] 
using a so called default call pattern for each of the procedures to be analysed. 
This default call pattern makes minimal realistic assumptions: the inputs to a 
procedure are in no way aliased, and only the outputs will be used after the call 
to the procedure. The data-flow analysis requires a fixpoint computation to deal 
with recursive predicate definitions. For more details, see [3,15]. 

Next the reuse analysis decides which reuses are possible (see Section 2.5). 
Different versions can then be created for the different reuses detected. While 
the underlying concepts were already developed in [15], the pragmatics of our 
implementation are discussed in this paper. 

Finally, low-level code corresponding to the detected reuses is generated. 

As Mercury allows programming with modules, the CTGC-system processes 
each module independently. Interface files are used to allow analysis information 
(structure-sharing and reuse information) generated while processing one module 
to be used when processing other modules. 

2.3 Data Representation 

The purpose of the CTGC-system is to identify which objects on the heap, 
so called datastructures, become dead and can therefore be reused. In order 
to understand what these objects are, we will clarify the way typed terms are 
usually¹ represented in the MMC. Consider the following types: 

:- type dir ---> north ; south ; east ; west. 

:- type example ---> a(int, dir) ; b(example). 

Terms of primitive types such as integers, chars, floats² and pointers to strings are 
represented as single machine words. Terms of types such as dir, in which every 
alternative is a constant, are equivalent to enumerated types in other languages. 
Mercury represents them as consecutive integers starting from zero, and stores 
them in a single machine word. Terms of types such as example are stored on 
the heap. The pointer to the actual term on the heap is tagged [9]. This tag is 
used to indicate the function symbol of the term. Terms of types having more 
function symbols than a single tag can distinguish use secondary tags. 

Figure 1 shows the representation of a variable A bound to b(a(3,east)). 
In this paper ha1, hy1, ... denote heap cells, whereas sa, sx, ... are registers or 
stack locations. 

¹ The MMC compiles to different back-ends, the most common being ANSI-C. Higher- 
level back-ends, such as Java or .NET, use different low level representations, yet 
the theory of recycling heap cells remains the same. 

² Depending on the word-size, these might have a boxed representation. 
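The representation described in Section 2.3 can be mimicked in a few lines of Python. This is a sketch under stated assumptions, not the MMC's actual layout: the number of tag bits (`TAG_BITS`), the tag values in `EXAMPLE_TAG`, and the use of list indices in place of machine addresses are all chosen for illustration.

```python
TAG_BITS = 2                                            # assumed number of primary tag bits
DIR = {"north": 0, "south": 1, "east": 2, "west": 3}    # enumeration: consecutive integers
EXAMPLE_TAG = {"a": 0, "b": 1}                          # assumed tags for type example

heap = []                                               # simulated heap, one entry per word


def allocate(words):
    """Append the given words to the heap and return the index of the first one."""
    address = len(heap)
    heap.extend(words)
    return address


def tag_pointer(address, functor):
    """Encode the function symbol in the low bits of the pointer."""
    return (address << TAG_BITS) | EXAMPLE_TAG[functor]


def untag_pointer(pointer):
    """Return (functor, address) for a tagged pointer."""
    tag = pointer & ((1 << TAG_BITS) - 1)
    functor = next(name for name, value in EXAMPLE_TAG.items() if value == tag)
    return functor, pointer >> TAG_BITS


inner = tag_pointer(allocate([3, DIR["east"]]), "a")    # a(3, east): two heap words
A = tag_pointer(allocate([inner]), "b")                 # b(...): one heap word
```

Building A = b(a(3,east)) this way occupies three heap words, matching the three cells ha1, ha2 and ha3 referred to in the text, while the constant east costs no heap space at all.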
Fig. 1. A = b(a(3,east)). 

:- pred convert1(example, example). 
:- mode convert1(in, out) is semidet. 
convert1(X, Y) :- X => b(X1), 
                  X1 => a(A1, _), 
                  Y1 <= a(A1, north), 
                  Y <= b(Y1). 

Fig. 2. Conversion procedure. 

Fig. 3. No reuse. 

Fig. 4. Reuse. 



2.4 Data Reuse 

Figure 3 shows the memory layout when calling convert1(A, B) (Fig. 2), where 
A is bound to b(a(3,east)) (Fig. 1). 

After deconstructing the input, new heap cells (hy1, hy2 and hy3) are 
allocated to create Y, and the content of X is partially copied into those cells. If it can 
be shown at compile-time that after this procedure call the term pointed at by 
X will not be referenced during the rest of the program (thus becoming available 
for reuse), then the deconstruction statements perform the last access ever to 
the concerned heap cells (ha1, ha2, ha3) after which they become garbage, and 
can be (re)used for Y (Fig. 4, the contents of sx and sx1 are no longer relevant). 

The optimization could go further and detect that this reuse only requires 
the update of one heap cell, namely ha3. Yet currently we mainly focus on the 
memory usage of a program, execution time being only of indirect importance. 
Therefore we do not try to optimize the number of field updates in the presence 
of reuse. See also Section 7. 
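The difference between the two layouts can be simulated in Python. The sketch below is only a model: the `Heap` class, its word counter, and the string addresses are hypothetical helpers, and `convert1` follows the four unifications of Fig. 2 with a flag that selects in-place reuse. Like the system described here, the reuse variant rewrites all fields rather than only the one that changes.

```python
class Heap:
    """Simulated heap that counts allocated words; a hypothetical helper."""

    def __init__(self):
        self.cells = {}            # address -> [functor, arg1, ..., argn]
        self.words_allocated = 0

    def construct(self, functor, *args):
        """Allocate a fresh cell with one word per argument."""
        address = "h%d" % len(self.cells)
        self.cells[address] = [functor, *args]
        self.words_allocated += len(args)
        return address

    def reuse(self, address, functor, *args):
        """Overwrite a dead cell in place instead of allocating."""
        self.cells[address] = [functor, *args]
        return address

    def term(self, address):
        """Read back the term stored at an address as nested tuples."""
        functor, *args = self.cells[address]
        return (functor, *[self.term(a) if a in self.cells else a for a in args])


def convert1(heap, x, with_reuse):
    """Follow Fig. 2; return the address of Y, or None if a deconstruction fails."""
    outer = heap.cells[x]                  # X => b(X1)
    if outer[0] != "b":
        return None
    x1 = outer[1]
    inner = heap.cells[x1]                 # X1 => a(A1, _)
    if inner[0] != "a":
        return None
    a1 = inner[1]
    if with_reuse:
        y1 = heap.reuse(x1, "a", a1, "north")    # Y1 <= a(A1, north) in the dead cell
        return heap.reuse(x, "b", y1)            # Y <= b(Y1) in the dead cell
    y1 = heap.construct("a", a1, "north")
    return heap.construct("b", y1)


def build_input(heap):
    """Construct b(a(3, east)) and return its address."""
    return heap.construct("b", heap.construct("a", 3, "east"))


plain, reusing = Heap(), Heap()
y_plain = convert1(plain, build_input(plain), with_reuse=False)
y_reuse = convert1(reusing, build_input(reusing), with_reuse=True)
```

Both calls yield b(a(3,north)), but the run without reuse allocates six words in total (three for the input, three for Y) whereas the run with reuse stays at the three words of the input.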



2.5 Types of Reuse 

During reuse analysis we make a distinction between different kinds of reuses [15]. 
The procedure shown in Fig. 2 has direct reuse (of X): it might contain the last 
reference to the cells of X which can then be reused by Y. However, the reuse is 
conditional: if the caller’s environment does not correspond to the default call 
pattern (e.g. keeping a reference to one of the cells marked for reuse), reuse is 
not allowed. As this imposes very harsh restrictions on the reuse possibilities, we 
introduced the notion of reuse conditions which express the minimal conditions 
a call pattern has to meet so that reuse is safe. These conditions are expressed in 
terms of the variables involved (here X). If the reuse is independent of the calling 
environment (X being a local variable), then we have unconditional reuse. 

:- type field1 ---> field1(int, int, int). 
:- type field2 ---> field2(int, int). 
:- type list(T) ---> [] ; [T | list(T)]. 

:- pred convert2(list(field1), list(field2)). 
:- mode convert2(in, out) is det. 

convert2(List0, List) :- 
    (   % switch on List0 
        List0 => [], List <= [] 
    ; 
        List0 => [Field1 | Rest0],     % (d1) 
        Field1 => field1(A, B, _C),    % (d2) 
        Field2 <= field2(A, B),        % (c1) 
        convert2(Rest0, Rest), 
        List <= [Field2 | Rest]        % (c2) 
    ). 

Fig. 5. Converting lists. 

Given the reuse conditions, the next step of the reuse analysis is to check for 
indirect reuses. Consider the following procedure: 

:- pred generate(example). 
:- mode generate(out) is semidet. 
generate(Y) :- generate_2(X), convert1(X, Y). 

Assuming the default call pattern for generate, the call to convert1 meets the 
condition that after that call X will not be used anymore. Hence, a reuse-version 
of convert1 can be called and we say that generate has indirect reuse. Moreover, 
X is a local variable, so reusing it will always be safe as it is independent of the 
call pattern to generate. This is an example of unconditional (indirect) reuse. 
If X would have been an input variable to generate, the indirect reuse would be 
conditional and additional reuse conditions would be formulated. 
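The distinction between safe and unsafe calls, and between conditional and unconditional reuse, can be approximated with a deliberately simplified Python sketch. It is not the reuse condition formalism of [15]: variables stand in for datastructures, aliases are plain pairs of variable names, and both function names are hypothetical.

```python
def reuse_is_safe(reused, live_after_call, aliases):
    """Return True if no live variable can still reach the cells of `reused`.

    reused: variable whose cells the callee wants to reuse.
    live_after_call: set of variables used after the call.
    aliases: set of frozenset({v1, v2}) pairs of possibly aliased variables.
    """
    if reused in live_after_call:
        return False
    for pair in aliases:
        if reused in pair and (pair - {reused}) & live_after_call:
            return False
    return True


def kind_of_reuse(reused, head_inputs):
    """Reuse of an input of the enclosing procedure depends on its caller."""
    return "conditional" if reused in head_inputs else "unconditional"
```

For generate, only Y is used after the call convert1(X, Y) and X is local, so `reuse_is_safe("X", {"Y"}, set())` holds and `kind_of_reuse("X", set())` gives unconditional reuse. If X were an input argument of generate, the second function would report conditional reuse instead.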

3 A Working Reuse Decision Approach 

Consider the predicate in Fig. 5 which converts a list of data-elements into a new 
list of data-elements. While the data-flow analysis spots the datastructures that 
can potentially be reused, it is up to the reuse analysis to select those reuses 
(direct and indirect) that yield the most interesting saving w.r.t. memory usage 
(and indirectly execution time). 

3.1 Deciding Direct Reuse 

A first restriction we impose is to limit reuses to local reuses, i.e. a dead cell can 
only be reused in the same procedure as where it is last accessed (deconstructed). 
In Section 6 we discuss techniques of how to lift this restriction. Furthermore, we 
consider that dead structures can only be reused by at most one new structure. 
Using the terminology of Debray [7], we limit ourselves to the simple reuse 
problem. It is not difficult to remove this limitation, but it makes the reuse 
decisions more complex. We plan to lift this restriction in the future. 

The data-flow analysis of the example identifies the deconstructed 
datastructures (at d1, resp. d2) as available for reuse. The procedure also contains two 
constructions (c1 and c2) where the memory from the dead cells could be reused. 

Each of the combinations yields an acceptable reuse-scheme. Yet, which one 
is the most interesting? It has been shown that this problem [7] can be reformu- 
lated as an instance of the maximum weight matching problem for a weighted 
bipartite graph. However for simplicity of implementation we have reduced this 
general matching problem to two orthogonal decisions: imposing constraints on 
the allowed reuses, and using simple strategies to select amongst different can- 
didates for reuse. We will discuss each of these. 

Constraints on allowed reuses. Constraints allow one to express common char- 
acteristics between the dead and the newly constructed cell and reflect the re- 
strictions which can be imposed by the back-end to which a Mercury program 
is compiled. 

We have implemented the following constraints: 

– Almost matching arities. This constraint expresses the intuition that it can 
be worthwhile to reuse a dead cell, even if not all memory-words are reused. 
This is indeed interesting if it can be guaranteed that the superfluous words 
will be collected by the run-time garbage collector within a reasonable delay. 
In our example, allowing a difference of size one allows c1 and c2 to reuse 
the memory available from either d1 or d2. 

– Matching arities. If the run-time system is not powerful enough to be used 
with almost matching arities, then a more restrictive constraint can be used: 
only allow reuse between constructors having the same arity. This means 
that in our example only d1 can be reused (by either c1 or c2). 

– Label-preserving. Using the Java or .NET back-end, it is not possible to 
change the type of run-time objects, therefore reuse is only allowed if the 
dead and new cell have the same constructor (label). For the example this 
means that the cell from d1 can only be reused in c2. 

Selection strategies. When a cell can reuse different dead cells, a choice has 
to be made (e.g. c1 can either reuse the cell available from d1 or d2). Some 
choices yield better results than others. We have experimented with two simple 
strategies: 

– Lifo. Traverse the body of the procedure and assign the reuses using a 
last-in-first-out selection strategy. This means that when a choice is left for a 
given construction, choose the cell which died most recently. The intuition is 
that after deconstructing a variable, it is very likely that a new similar cell 
will be constructed in the same context. 
E.g. if c1 is allowed to reuse the cells from d1 or d2, then according to this 
strategy, Field1 will be reused for constructing Field2 and List0 for List. 

– Random. The intuition behind the lifo-strategy might not always be true, for 
example in the presence of a disjunction³. Therefore we have added a simple 
strategy which randomly selects the dead cell amongst all the candidates. 
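The interplay of the three constraints with the lifo strategy can be replayed on the convert2 example in Python. This is a sketch and not the MMC implementation: the `Cell` record, the `BODY` event list, and the name `cons` for the list constructor are assumptions introduced for the example, and each dead cell is handed out at most once, as in the simple reuse problem.

```python
from collections import namedtuple

Cell = namedtuple("Cell", "label var functor arity")

# Body of convert2 (second branch) as a sequence of deaths and constructions.
BODY = [
    ("dead", Cell("d1", "List0", "cons", 2)),
    ("dead", Cell("d2", "Field1", "field1", 3)),
    ("new", Cell("c1", "Field2", "field2", 2)),
    ("new", Cell("c2", "List", "cons", 2)),
]


def almost_matching(dead, new, max_diff=1):
    """The new cell fits in the dead one, wasting at most max_diff words."""
    return 0 <= dead.arity - new.arity <= max_diff


def matching(dead, new):
    """Dead and new cell have exactly the same arity."""
    return dead.arity == new.arity


def label_preserving(dead, new):
    """Dead and new cell have the same constructor."""
    return dead.functor == new.functor and dead.arity == new.arity


def lifo(body, allowed):
    """Give each construction the most recently died cell that is allowed."""
    available = []                      # dead cells not yet reused, oldest first
    chosen = {}                         # construction label -> dead cell label
    for kind, cell in body:
        if kind == "dead":
            available.append(cell)
            continue
        for dead in reversed(available):
            if allowed(dead, cell):
                available.remove(dead)
                chosen[cell.label] = dead.label
                break
    return chosen
```

With `almost_matching` the result is c1 reusing d2 and c2 reusing d1, i.e. Field1 is reused for Field2 and List0 for List, as stated above. With `matching` only d1 qualifies and lifo hands it to c1, and with `label_preserving` the only reuse left is d1 by c2.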



3.2 Deciding Indirect Reuse 

In order to decide whether a call to a procedure can be substituted by a call 
to a reuse version of that procedure, we must be sure that such substitution is 
safe. This is tested by checking the reuse-conditions (under the assumption of 
a default call pattern). If it is safe to call the reuse-version we have to decide 
whether we will do so or not. 

Here we have opted for simplicity: we always call the reuse-version of a 
procedure if it is safe to do so. In Section 7 we discuss the drawbacks and suggest 
a possible better solution. 

Suppose that for our previous example we would only allow the reuse of 
the list-cells (d1 by c2). Such reuse is conditional: the list cell dies iff it 
is not needed within the caller’s context. This condition has to be checked for 
the recursive call. Under the default call pattern (see Section 2.4) Rest0 is dead 
at the moment of the recursive call, hence the condition is satisfied, and the 
recursive call can safely be substituted by a call to its reuse version⁴. 



3.3 Splitting into Different Versions 

Once the possible direct and indirect reuses have been decided, there is one 
remaining decision left: how many versions of a given procedure should be 
created? In our example, we might have detected three reuses: List0 reused by 
List, Field1 by Field2, and the indirect reuse (the recursive call to the reuse 
version). We can generate 4 interesting versions of the initial procedure: a 
version with no reuse, a version reusing only List0, a version reusing Field1 and 
a version reusing both (where the reuse versions also include the recursive reuse 
call). In general, for a procedure with n possible direct reuses, 2^n interesting 
versions can be created. In our implementation we limit the number of versions 
to at most two: a version which imposes no conditions on the caller (containing 
all possible unconditional reuses), and a version containing all detected reuses. 
In Section 7 we briefly discuss other possibilities. 
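The count of 2^n candidate versions and the restriction to two of them can be spelled out in a short Python sketch. The function names are hypothetical, and the example assumes that both direct reuses in convert2 are conditional because they concern cells of the input list.

```python
from itertools import combinations


def all_versions(direct_reuses):
    """Every subset of the direct reuses: 2**n candidate versions."""
    return [set(subset)
            for size in range(len(direct_reuses) + 1)
            for subset in combinations(direct_reuses, size)]


def implemented_versions(direct_reuses, unconditional):
    """The two versions kept: unconditional reuses only, and all reuses."""
    return [set(direct_reuses) & set(unconditional), set(direct_reuses)]


reuses = ["List0", "Field1"]
candidates = all_versions(reuses)
kept = implemented_versions(reuses, unconditional=[])
```

For the two direct reuses of the example, `candidates` holds the four versions listed in the text, while `kept` contains only the version without reuse and the version reusing both List0 and Field1.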

³ E.g. X => f(..), ( ... Y <= f(..) ; ... ), Z <= f(..): as the first branch of 
the disjunction might not always be executed, it is more interesting to allow Z to 
reuse X than Y. 

⁴ Note that this indirect reuse is in itself conditional: it can only be allowed if the 
list-cells of Rest0 are not needed in the caller’s context. 

4 Low Level Additions 

Given the previous decisions, a first CTGC-system was implemented. Although 
good results were obtained for small programs (e.g. naive-reverse), we ran into 
problems when analysing large ones: 

– imprecision in the alias analysis had the effect that relatively few cells were 
recognized as dead. 

– the number of aliases collected within a procedure became huge. This slowed 
down the operations manipulating them and the CTGC process became too 
time consuming. 



4.1 Enhancing the Aliasing Precision 

The underlying analysis for deriving alias-information uses the concept of top 
which expresses that all data parts might be aliased. This is a safe abstraction 
in the case of total lack of knowledge about the possible existing aliases at some 
program point. Once generated, this lack of information propagates rapidly as 
all primitive operations manipulating it yield top as well. 

Such a top is generated in the presence of language constructs with which the 
analysis cannot cope yet. These are procedures defined in terms of foreign code 
(C, C++), higher-order calls and typeclasses. It is also generated for procedures 
which are defined in other modules that have not yet been analysed and for 
which no interface files have been generated yet. 

To obtain a usable CTGC-system, techniques were needed to limit the 
creation and propagation of top. In our implementation, three techniques are used: 

1. Using heuristics. Based on the type- and mode-declaration of a procedure, 
one can derive whether it can create additional aliases or not, without looking 
at the procedure’s body. This is the case when a procedure uses unique 
objects (declared di or uo [11]), or only has unique output variables⁵ or 
when the non-unique output arguments are of a type for which sharing is 
not possible (integers, enums, chars, etc.). In all these cases, it is safe to 
conclude that the procedure will not introduce new aliases. 

2. Manual aliasing annotation for foreign code. Important parts of the Mercury 
standard library consist of procedures which are defined in terms of foreign 
code. With the intention to be used mainly in this standard library, we have 
extended the Mercury language such that foreign code can be manually 
annotated with aliasing-information. 

3. Manual iteration for mutually dependent modules. The current compilation 
scheme of Mercury is not yet able to cope with mutually dependent modules. 
Consider a module A in which some procedures are expressed in terms of 
procedures declared in a module B, and vice versa. The normal compilation 
scheme is to compile one of the files, and then the other one. In the presence 
of an optimizing compiler this is not enough. At the moment the first module 
is compiled, nothing is known about the second one, yielding bad precision 
for the first one. This bad precision will propagate further to the second 
file as the second file relies on the first one. Bueno et al. [5] propose a new 
compilation scheme which is able to handle these cases. As this requires 
quite some work, we work around it by allowing manually controlled 
incremental compilation. 

⁵ A procedure call cannot create additional aliases between input variables as they 
must be ground at the moment when the procedure is called. 
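The first technique, deciding from the declarations alone, lends itself to a compact illustration. The Python sketch below is a simplified reading of that heuristic and not the MMC's code: the function name, the encoding of arguments as (mode, type) pairs, and the contents of `ATOMIC_TYPES` are assumptions. Floats are left out of the set because they may be boxed.

```python
ATOMIC_TYPES = {"int", "char", "enum"}   # types whose terms cannot share heap cells


def cannot_create_aliases(arguments):
    """arguments: list of (mode, type) pairs taken from a procedure's declarations.

    Returns True when every output is unique (mode "uo") or of an atomic type,
    so that the body need not be inspected to rule out new aliases.
    """
    outputs = [(mode, typ) for mode, typ in arguments if mode in ("out", "uo")]
    return all(mode == "uo" or typ in ATOMIC_TYPES for mode, typ in outputs)
```

A length-like procedure with arguments `[("in", "list"), ("out", "int")]` passes the test, as does an I/O procedure with `[("in", "string"), ("di", "io"), ("uo", "io")]`. An append-like procedure with a non-unique list output does not, so for foreign code it would need a manual annotation to avoid top.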



4.2 Making Compilation Faster: Widening the Aliasing 

While it is interesting to have more precise aliasing information than simply top, 
having more aliases also slows down the system. Now one can argue that speed is 
not a major requirement of a CTGC-system as it is primarily intended to be used 
only at the final compilation phase of a program, but even for our benchmarks 
we were not ready to wait hours for a module to compile. Therefore, in order to 
produce a usable CTGC-system we have added a widening operator [6] which 
acts upon the aliases produced⁶. 

During the data-flow analysis, a datastructure is represented by its full path 
down the term it is part of. Such a path is a concatenation of selectors which 
selects the functor and the exact argument position in the functor⁷. Aliases are 
expressed as pairs of datastructures. 

To illustrate this, let us consider the following definition of a tree type: 

:- type tree ---> e ; two(int, tree, tree) 
             ; three(int, int, tree, tree, tree). 

After the construction V <= three(2, 3, two(0, e, e), A, A) (where A is a variable 
bound to another tree-term), the path (three, 3) · (two, 1) selects in V the 
zero-integer. The path (three, 3) selects in V the whole datastructure corresponding 
to the first subtree (namely two(0,e,e)). In V, the positions corresponding with 
the paths (three, 4) and (three, 5) are aliased. 

For the aliasing information, we introduced type widening that consists of 
replacing a full path of normal selectors by one selector, a so called type 
selector. The meaning of a type selector is as follows: instead of selecting one 
specific subterm of a term, it will select all the subterms which have the type 
expressed by the selector. In our example, the paths (three, 1), (three, 2), and 
(three, 3) · (two, 1) all select integer elements of V. With type widening, all these 
selectors are reduced to the selector (int), i.e. the type of the subterms which 
they select. The alias in V (i.e. between (three, 4) and (three, 5)) becomes an 
alias between (tree) and (tree), hence expressing that all subtrees of V might be 
aliased. If other aliases between subtrees of V exist, then they will all be replaced 
by this one single alias, hence making the overall size of the set of aliases smaller. 

⁶ This widening operator can be enabled on a per-module basis. The user can also 
specify the threshold at which widening should be performed: e.g. only widen if the 
size of the set of aliases exceeds 1000. 

⁷ Infinite paths are avoided by simplifying full type trees to type graphs. This is beyond 
the scope of this paper. 

This widening leads to a considerable speed-up of the CTGC-system 
(compilation of some modules taking almost one hour was now reduced to less than 
a minute). Our results suggest that the overall precision remains sufficient in 
order to detect the expected reuses for our benchmarks. 
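How type widening shrinks a set of aliases can be shown with a small Python sketch on the tree example. The representation is an assumption made for illustration: a datastructure is a pair of a variable and a non-empty tuple of (functor, position) selectors, the table `ARG_TYPE` plays the role of the type declarations, and the second and third alias below are invented to show several aliases collapsing.

```python
# Argument types of the tree constructors, indexed by (functor, position).
ARG_TYPE = {
    ("two", 1): "int", ("two", 2): "tree", ("two", 3): "tree",
    ("three", 1): "int", ("three", 2): "int",
    ("three", 3): "tree", ("three", 4): "tree", ("three", 5): "tree",
}


def widen(datastructure):
    """Replace the full selector path of (variable, path) by one type selector."""
    variable, path = datastructure
    return (variable, ARG_TYPE[path[-1]])


def widen_aliases(aliases):
    """Widen both sides of every alias pair; equal pairs collapse in the set."""
    return {tuple(sorted((widen(left), widen(right)))) for left, right in aliases}


aliases = {
    (("V", (("three", 4),)), ("V", (("three", 5),))),
    (("V", (("three", 3), ("two", 2))), ("V", (("three", 3), ("two", 3)))),
    (("V", (("three", 3), ("two", 2))), ("V", (("three", 4),))),
}
widened = widen_aliases(aliases)
```

The three aliases between subtrees of V are replaced by the single alias between (tree) and (tree), which is the loss of precision that buys the reduction in set size.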

5 First Results 

We have evaluated the effectiveness of our CTGC-system by comparing 
memory usage and measuring compilation times. We have used toy benchmarks and 
one real-life program. All the experiments were run on an Intel Pentium III 
(600 MHz) with 256MB RAM, using Debian Linux 2.3.99, under a usual 
workload. The CTGC-system was integrated into version 0.9.1 of the MMC. The 
reported memory information is obtained using the MMC memory profiler. This 
profiler counts the total number of memory words that are allocated on the 
heap⁸. The timings are averages of 10 runs each time. All the benchmarks are 
compiled using a non-optimized Mercury standard library w.r.t. memory usage 
(hence no reuse in the library predicates⁹). This allows us to focus on the reuse 
occurring in the actual code of the benchmarks. 

The toy benchmarks comprise nrev (naive reverse of a list of 3000 integers), 
qsort (quick sort of a sorted list of 10000 integers), and argo_cnters (a benchmark 
counting various properties of a file, also used in [15]). Table 1 shows the results. 
These are independent of the CTGC configuration used, as they all yield the 
same results here. For each of the benchmarks every possible reuse is detected, 
yielding the expected savings in memory usage and execution time. 



Table 1. Toy benchmarks. C = compilation time, M = number of allocated words,
R = execution time, m = relative reduction in memory usage.

               No Reuse                       Reuse
module         C (sec)  M (Word)  R (sec)     C (sec)  M (Word)  m (%)   R (sec)
nrev           1.49     9M        1.51        11.79    6k        -99.9   0.32
qsort          1.40     50M       36.63       11.29    20k       -99.9   27.22
argo_cnters    4.53     3.00M     0.35        16.38    2.60M     -13.3   0.32



Next to small benchmarks, we found it important to evaluate the system on a 
real-life program, where the different constraints and strategies do make a differ- 
ence. The program we used is a ray tracer program developed for the ICFP’2000 
programming contest [17] where it ended up fourth. This program transforms a 
given scene description into a rendered image. It is a CPU- and memory-intensive 
process, and therefore an ideal candidate for our CTGC-system to be tested on. 
A complete description of this program can be found in [20].

® Note that this count is independent of any run-time garbage collection. 

® Normally, a Mercury system with CTGC would also have the library modules com- 
piled with CTGC in the same way as user modules. 
The program consists of 20 modules (5700 lines of code), containing mostly
deterministic predicates. All modules could be compiled without widening,
except for one: peephole. This module manipulates complex constructors and
generates up to 11K aliases. Without type-widening, the compilation of peephole
takes 160 minutes. With type-widening (at 500 aliases), it takes only 40 seconds.
The compilation of the whole program with CTGC (and widening) takes 5 minutes,
compared to 1 minute for a normal compilation. As some of these modules
depend on each other, the technique of manually iterating the compilation was
used to obtain better results. For this benchmark, the compilation had to be
repeated 3 times to reach a fixpoint (for a total time of 15 minutes). Each time
every module was recompiled. In a smart compilation environment, most of the
recompilations could be avoided.

To measure the effects of the different constraints and strategies we have
compiled the ray tracer with different CTGC-configurations. The first row of
Table 2 (Row 0) shows the number of memory words and the execution time (in
seconds) needed to render a set of 27 different scene descriptions (ranging from
simple scenes to more complex ones) using a version of the ray tracer without
CTGC. Rows 1 to 9 show the relative memory usage and execution time of ray
tracers compiled using different CTGC-configurations for the same set of scene
descriptions:

- Using the matching arities (match) or label-preserving (same cons)
  constraints, up to 24% memory can be saved globally. For some scene
  descriptions, this can go up to 30%. There is also a noticeable speedup (14%).

- Using almost matching arities within a distance of one (within 1) or two
  (within 2), much less memory is saved (only 10%) with hardly any speedup.
  The poor memory usage is not surprising, as none of the selection strategies
  takes into account the correspondence of the arities between a new cell and
  the available dead cells. The poor timings can also be explained: with
  non-matching arities, reuse leaves space leaks which cannot immediately be
  detected by the current run-time garbage collector, hence the garbage
  collector will be called more often. Improvements to the garbage collector
  are required.

- Globally, using the random selection strategy yields slightly worse results
  than lifo. For some scene descriptions though, results are better, but without
  spectacular differences.

- Row 9 shows the results of a ray tracer compiled using a version of the
  Mercury standard library with CTGC. There is hardly any difference with
  Row 1, where libraries were used without CTGC. This is due to the fact that
  the ray tracer makes limited use of these libraries.

Finally, a version of the ray tracer was built without type-widening (lifo
and matching arities). Compared to Row 1 in Table 2, the overall memory usage
difference is less than 1%. The execution times are comparable.
Table 2. ICFP-ray tracer using different CTGC-configurations.

Row  Configuration                Memory                  Time
                                  (kWord)       (%)       (sec)    (%)
 0   no CTGC                      1024795.51    -         362.31   -
 1   lifo    match                 776707.92    -24.21    311.85   -13.93
 2   lifo    same cons             791742.06    -22.74    313.57   -13.45
 3   lifo    within 1              916642.90    -10.55    361.84    -0.13
 4   lifo    within 2              917847.97    -10.44    359.90    -0.67
 5   random  match                 780838.58    -23.81    310.75   -14.23
 6   random  same cons             795872.67    -22.34    312.70   -13.69
 7   random  within 1              920764.26    -10.15    359.14    -0.87
 8   random  within 2              921969.35    -10.03    355.08    -2.00
 9   lifo    match      libs       775607.04    -24.32    320.32   -11.59
10   lifo    match      cc         513901.37    -49.85    301.66   -16.74
11   lifo    same cons  cc         542626.80    -47.05    304.20   -16.04
12   lifo    within 1   cc         845603.55    -17.49    375.79     3.72
13   lifo    within 2   cc         864722.90    -15.62    370.49     2.26
14   random  match      cc         518032.04    -49.45    299.45   -17.35
15   random  same cons  cc         546757.48    -46.65    302.79   -16.43
16   random  within 1   cc         849724.90    -17.08    363.90     0.44
17   random  within 2   cc         868844.29    -15.22    391.68     8.11

6 Non-local Reuse: Cell Cache 

So far we have assumed that all dying data structures must be reused locally,
i.e. within the same procedure in which they die. Hence some interesting
possibilities for reuse may be missed.

We see three ways to achieve non-local reuses as well. The first and the 
most difficult is to extend the data-flow analysis to handle non-local reuse. The 
analysis would have to propagate possible dead cells and thus become quite 
complex. It would also require intensive changes in the internal calling convention 
of procedures within the MMC as the address of the cells to be reused would 
have to be passed between procedures. The second approach is to combine reuse 
analysis with inlining in such a way that the cell death and subsequent reuse end 
up in the same procedure. The third approach, which is the one we implemented, 
is to cache dead cells. Whenever a cell dies unconditionally and cannot be reused 
locally, we mark it as cacheable. At runtime the address of the cell as well as 
its size will be recorded in a cache (or free list). Before each memory allocation 
the runtime system will first check the cell cache to see if a cell of the correct 
size is available and use that cell instead of allocating a new cell. This operation 
increases the time taken to allocate a memory cell in the case of the cell cache 
being empty, and hence should only be a win if the cell cache occupancy rate 
is high. It also avoids new allocations so the overall cost of the runtime garbage 
collection system should go down due to smaller heap sizes and less frequent 
need for garbage collection. 
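The cell cache described above can be pictured as a free list indexed by cell size. The following Python sketch is only an illustration of that idea under our own assumptions: the class names `CellCache` and `BumpHeap`, and the exact-size lookup policy, are hypothetical and do not describe the actual Mercury runtime code.

```python
from collections import defaultdict

class CellCache:
    """Free list of dead cells, indexed by cell size in words (hypothetical)."""

    def __init__(self):
        self._free = defaultdict(list)  # size -> addresses of cached dead cells
        self.hits = 0
        self.misses = 0

    def release(self, address: int, size: int) -> None:
        """Record a cell that died unconditionally and was not reused locally."""
        self._free[size].append(address)

    def allocate(self, size: int, fresh_alloc) -> int:
        """Return a cached cell of exactly `size` words, else allocate a new one."""
        cells = self._free.get(size)
        if cells:
            self.hits += 1
            return cells.pop()
        self.misses += 1  # the lookup was pure overhead in this case
        return fresh_alloc(size)

class BumpHeap:
    """Toy heap handing out increasing addresses (stand-in for the allocator)."""

    def __init__(self):
        self.top = 0

    def alloc(self, size: int) -> int:
        addr = self.top
        self.top += size
        return addr

heap = BumpHeap()
cache = CellCache()
a = cache.allocate(2, heap.alloc)  # miss: fresh cell
cache.release(a, 2)                # the cell dies and is cached
b = cache.allocate(2, heap.alloc)  # hit: the dead cell is reused
c = cache.allocate(3, heap.alloc)  # miss: no cached cell of size 3
```

The `misses` counter corresponds to the case the text warns about: every allocation pays for the lookup, so the scheme only pays off when the hit rate is high.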








The cc-entries of Table 2 (Rows 10-17) show the results of CTGC-configu- 
rations combined with the cell cache technique. Compared to the basic CTGC- 
configurations, cell caching always increases memory savings, going up to 49% 
(for some scenes even 70%). In the case of label-preserving or matching ari- 
ties constraints, execution time drops slightly. On the other hand, using almost 
matching arities combined with cell caching increases the execution time. 



7 Further Improvements 

In the near future, we intend to explore a number of improvements to our system. 
First, for some procedures, several possibilities of reuse are discovered, each one 
imposing its own reuse conditions. Taken together, these reuse conditions are 
too restrictive on the caller, hence hardly any calling environment is able to 
satisfy them, and no reuse is performed at all. A top-down call-dependent version 
splitting pass could aid in generating more useful reuse-versions of procedures, 
and avoid the generation of the useless ones. 

A second problem is the overly absorbing effect of the notion of top currently 
used in the alias information. Once top is encountered, it propagates all through- 
out the remainder of the code. Instead of top, we could use topmost substitu- 
tions [4]: e.g. generating all possible combinations of aliases between the argu- 
ments of a called predicate, based on the types of these arguments, either explic- 
itly or in a more compact form (using type-selectors or keeping sets of variables, 
stating that these variables might be aliased to each other in any possible way). 

In this paper we mainly focussed on memory savings, reasoning that saving 
memory implies less garbage collection, hence diminishes the execution time. If 
execution time is of primary concern, then more sophisticated reuse strategies 
will be needed. In the near future we will adopt the use of weighted graphs [7] 
where the weights can be adjusted for minimizing memory usage or execution 
time (taking into account the fields that do not need to be updated). We will 
also consider splitting dead cells and reusing them for different new cells. 

In [10] the focus on execution time is even greater, trying to discover almost 
every field not requiring an update, going even beyond the boundaries of single 
procedures. This is indeed important in Prolog, where the determinism of
procedures is not necessarily known at analysis time, and where, given the underlying 
data-flow analysis, each cell update requires extra care in the case the value has 
to be reset upon backtracking. In Mercury, where determinism is known at com- 
pilation time, and where the analysis explicitly takes into account backtracking, 
this is not a major issue. Therefore, it is not our immediate intention to try to 
avoid every possible cell update. 



8 Conclusion 

This paper describes a complete working compile-time garbage collection sys- 
tem for Mercury, a logic programming language with declarations. The system 
consists of three passes: data-flow analysis, reuse decision, and low-level code
generation. The data-flow analysis, based on [15], detects which cells become
available for reuse. This paper presents easily implementable restrictions,
constraints and strategies for selecting realistic reuses. In order to obtain a
workable CTGC-system, low-level improvements were introduced.

A major contribution of this work is the integration of the CTGC system in 
the Melbourne Mercury Compiler and its evaluation. Some small benchmarks 
were used, but also one real-life complex program, a ray tracer. Average global 
memory savings of up to 49% were obtained, with a speedup of up to 17%. It 
would be interesting to compare these results with the total potential of reuse 
within the program. This total potential could be approximated using the tech- 
niques used in our first prototype [15] to predict the amount of reuse. 

Besides the proposed improvements, the system could also be adapted to handle
higher-order calls and type classes properly (instead of generating top
aliasing and not allowing reuse). Yet given the fact that many higher-order
calls are specialized away by the compiler, we currently do not believe that the
overhead needed to deal with these language constructs is worthwhile.
Abstract. Boolean functions are ubiquitous in the analysis of (constraint)
logic programs. The domain of positive Boolean functions, Pos,
has been used for expressing, for example, groundness, finiteness and 
sharing dependencies. The performance of an analyser based on Boolean 
functions is critically dependent on the way in which the functions are 
represented. This paper discusses multiheaded clauses as a representa- 
tion of positive Boolean functions. The domain operations for multi- 
headed clauses are conceptually simple and can be implemented straight- 
forwardly in Prolog. Moreover these operations generalise those for the 
less algorithmically complex operations of propositional Horn clauses, 
leading to naturally stratified algorithms. The multiheaded clause
representation is used to build a Pos-based groundness analyser. The analyser 
performs surprisingly well and scales smoothly, not requiring widening 
to analyse any program in the benchmark suite. 

Keywords. Abstract interpretation, (constraint) logic programs, Boolean
functions, groundness analysis.



1 Introduction 

Boolean functions play an important role in the practice of static analysis. Many 
analyses are couched in terms of Boolean functions, and manipulation of these 
functions is crucial to the performance of any implementation. In particular, 
positive Boolean functions have been applied to the analysis of logic programs 
for properties such as groundness, rigidity [15], finiteness [3] and sharing [8]. This 
paper advocates representing positive Boolean functions as multiheaded clauses 
and argues that Prolog is well suited to their manipulation. 

The choice of abstract domain for a particular application involves the strik- 
ing of a balance between efficiency and precision. The various properties tracked 
using positive Boolean functions give rise in practice to different forms of Boolean 
function. Hence, in some applications, restricting to a more computationally 
tractable subclass of Pos can have a significant impact on precision (for example, 
goal-independent analysis of library code), whilst in others little precision is lost 
(for example, goal-dependent groundness analysis). Elsewhere, the authors have 
discussed various subclasses of Pos and their computational properties [17,19]. 
Here, with an eye to a wider range of applications, the authors adapt techniques 
from these subclasses to Pos. 

Traditionally, Boolean function manipulation has been performed using bi- 
nary decision diagrams (BDDs). Groundness analysis is one of the most im- 
portant topics in the static analysis of (constraint) logic programs and from 
a logic programming point of view this analysis is the most practical test of 
Boolean function manipulation. BDD-based analysers have consistently outper- 
formed those based on other representations of Boolean functions [1,2,10,24] 
for groundness analysis, but there has been a continuous stream of work on 
representations amenable to Prolog implementation [7], in particular for the 
subclass of definite positive functions, Def [12,13,19]. The majority of these
implementations, including those based on BDDs, require widening to analyse large 
benchmarks. 

The Def-based groundness analyser described in [19] does not require widen- 
ing and was designed so that the most frequently called domain operations are 
the most lightweight. The same design methodology suggests that a Pos-based 
analyser should represent Boolean functions as conjunctions of multiheaded 
clauses. In fact, in [1] (reduced) conjunctive normal form, (R)CNF, was investi- 
gated, and “performed reasonably well”, but was ultimately rejected since BDDs 
performed 40% faster and, in C (their implementation language), conjunctive 
normal form is no easier to code than BDDs. Surprisingly, conjunctive normal 
forms have not been considered since. This paper revisits clausal representations 
of Pos since, in Prolog, clausal representations are much easier to code than 
BDDs and following the methodology of [19] the clausal representation lends 
itself to efficient implementation based on entailment checking. 

The importance of the choice of representation is clearly illustrated by the 
subtle difference between multiheaded clauses and RCNF. The RCNF represen- 
tation is reduced in the sense that no clause subsumes another. This reduction 
makes meet for RCNF quadratic in the size of the representation. The mul- 
tiheaded clause representation may contain redundant clauses, enabling meet 
to be constant time. This is an important issue for performance since meet is 
by far the most frequently applied operation. Neither multiheaded clause nor 
RCNF representations are in a canonical form, therefore equivalence cannot be 
detected by straightforward syntactic identity. In [1] equivalence for RCNF is 
determined by computing the dual Blake canonical form of the formulae and 
then testing for syntactic identity. The dual Blake canonical form may be expo- 
nentially larger than the RCNF representation and must always be completely 
computed. Therefore the method is not amenable to filtering through lower com- 
plexity algorithms. Logical entailment, rather than syntactic equivalence, is more 
flexible. In practice, entailment of formulae can often be detected using an in- 
complete low complexity algorithm. Using such a check, many calls to the worst 
case algorithm can be filtered out. It is this stratified use of entailment checking 
that enables an analyser based on multiheaded clauses to scale surprisingly well. 
Speed is achieved by exploiting Prolog technology: by using a non-ground
representation, entailment checking can be implemented efficiently using
renaming and block declarations, whilst meet reduces to list concatenation
implemented using difference lists. The major themes and contributions of this
work are:

• Pos functions can be naturally expressed as multiheaded clauses, which are 
particularly straightforward to understand, manipulate and code. 

• The entailment checking algorithm (which is potentially exponential in the 
number of variables) is stratified so that checks for naturally occurring sub- 
classes of formulae take quadratic time (in the size of the formulae); in par- 
ticular the forward chaining algorithm for propositional Horn clauses is sub- 
sumed. 

• The domain operations for multiheaded clauses may be coded succinctly and 
efficiently in Prolog, resulting in fast Pos-based goal-dependent and goal- 
independent groundness analysers which do not require widening for any 
program in the benchmark suite. 

• If widening is required, the representation may be simply and naturally 
widened to Def or to the simpler domain EPos. 

• The analysers again demonstrate the value of a principled approach to the 
design of a static analysis. 

• An experimental evaluation of the analysers is given illustrating that a 
clausal representation of Pos coded in Prolog gives performances comparable 
to BDD representations coded in C. 

The rest of this paper is structured as follows: Section 2 introduces the neces- 
sary technical background material. Section 3 details multiheaded clauses. Sec- 
tion 4 gives algorithms for the abstract operations of Pos represented as multi- 
headed clauses. Section 5 describes Pos-based groundness analysers implemented 
with Boolean functions represented as multiheaded clauses. Section 6 gives an 
experimental evaluation of these analysers. Section 7 reviews related work and 
Section 8 concludes. 

2 Preliminaries 

A Boolean function is a function $f : Bool^n \to Bool$ where $n \geq 0$. Let
$V$ denote a denumerable universe of variables. A Boolean function can be
represented by a propositional formula over $X \subseteq V$ where $|X| = n$.
The set of propositional formulae over $X$ is denoted by $Bool_X$. Throughout
this paper, Boolean functions and propositional formulae are used
interchangeably without worrying about the distinction. The convention of
identifying a truth assignment with the set of variables $M$ that it maps to
true is also followed. Specifically, a map $\psi_X : \wp(X) \to Bool_X$ is
introduced, defined by: $\psi_X(M) = (\wedge M) \wedge \neg(\vee (X \setminus M))$.
In addition, the formula $\wedge Y$ is often abbreviated as $Y$.

Definition 1. The map $model_X : Bool_X \to \wp(\wp(X))$ is defined by:
$model_X(f) = \{M \subseteq X \mid \psi_X(M) \models f\}$. Also,
$countermodel_X : Bool_X \to \wp(\wp(X))$ is defined by:
$countermodel_X(f) = \wp(X) \setminus model_X(f)$. Observe that $model_X$ is
bijective, hence $model_X^{-1} : \wp(\wp(X)) \to Bool_X$ is well defined.
Fig. 1. Hasse diagrams 



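Example 1. If $X = \{x, y\}$, then the function
$\{\langle true, true\rangle \mapsto true, \langle true, false\rangle \mapsto false,
\langle false, true\rangle \mapsto false, \langle false, false\rangle \mapsto false\}$
can be represented by the formula $x \wedge y$. Also,
$model_X(x \wedge y) = \{\{x, y\}\}$ and
$model_X(x \vee y) = \{\{x\}, \{y\}, \{x, y\}\}$.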

The focus of this paper is on the use of subclasses of $Bool_X$ in tracing
dependencies. These subclasses are defined below:

Definition 2. A function $f$ is positive iff $X \in model_X(f)$. $Pos_X$ is the
set of positive Boolean functions over $X$. A function $f$ is definite iff
$M \cap M' \in model_X(f)$ for all $M, M' \in model_X(f)$. $Def_X$ is the set of
positive functions over $X$ that are definite. A function $f$ is GE iff $f$ is
definite positive and for all $M, M' \in model_{var(f)}(f)$,
$|M \setminus M'| \neq 1$. $EPos_X$ is the set of GE functions over $X$.

Note that $EPos_X \subseteq Def_X \subseteq Pos_X$. Also notice that
$EPos_X = \{\wedge F \mid F \subseteq X \cup E_X\}$, where
$E_X = \{x \leftrightarrow y \mid x, y \in X\}$.

Example 2. Suppose $X = \{x, y, z\}$ and consider the following table, which
states, for some Boolean functions, whether they are in $EPos_X$, $Def_X$ or
$Pos_X$ and also gives $model_X$.

$f$                         $EPos_X$  $Def_X$  $Pos_X$  $model_X(f)$
false                                                   $\emptyset$
$x \wedge y$                •         •        •        $\{\{x,y\}, \{x,y,z\}\}$
$x \vee y$                                     •        $\{\{x\}, \{y\}, \{x,y\}, \{x,z\}, \{y,z\}, \{x,y,z\}\}$
$x \leftarrow y$                      •        •        $\{\emptyset, \{x\}, \{z\}, \{x,y\}, \{x,z\}, \{x,y,z\}\}$
$x \vee (y \leftarrow z)$                      •        $\{\emptyset, \{x\}, \{y\}, \{x,y\}, \{x,z\}, \{y,z\}, \{x,y,z\}\}$
true                        •         •        •        $\{\emptyset, \{x\}, \{y\}, \{z\}, \{x,y\}, \{x,z\}, \{y,z\}, \{x,y,z\}\}$

Note that $x \vee y$ is not in $Def_X$ (since its set of models is not closed
under intersection) and that false is neither in $EPos_X$, nor $Def_X$, nor
$Pos_X$.
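The classification in Example 2 can be checked mechanically by enumerating models. The following Python sketch is only an illustration of Definitions 1 and 2: formulae are encoded as predicates over the set of true variables, and the helper names (`powerset`, `models`, `classify`) as well as the explicit `var_f` argument are our own choices.

```python
from itertools import combinations

def powerset(xs):
    """All subsets of xs, as frozensets."""
    xs = sorted(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

def models(f, X):
    """model_X(f): the sets M of true variables under which f holds."""
    return {M for M in powerset(X) if f(M)}

def classify(f, var_f, X):
    """Return (in EPos_X, in Def_X, in Pos_X) for f, following Definition 2."""
    ms = models(f, X)
    pos = frozenset(X) in ms
    definite = pos and all((M & N) in ms for M in ms for N in ms)
    local = models(f, var_f)  # the GE condition is stated over var(f)
    ge = definite and all(len(M - N) != 1 for M in local for N in local)
    return ge, definite, pos

X = {"x", "y", "z"}
table = {
    "false":        (lambda M: False, set()),
    "x & y":        (lambda M: "x" in M and "y" in M, {"x", "y"}),
    "x | y":        (lambda M: "x" in M or "y" in M, {"x", "y"}),
    "x <- y":       (lambda M: "y" not in M or "x" in M, {"x", "y"}),
    "x | (y <- z)": (lambda M: "x" in M or "z" not in M or "y" in M, {"x", "y", "z"}),
    "true":         (lambda M: True, set()),
}
result = {name: classify(f, vf, X) for name, (f, vf) in table.items()}
for name, flags in result.items():
    print(f"{name:14} EPos={flags[0]!s:5} Def={flags[1]!s:5} Pos={flags[2]!s:5}")
```

Enumeration is exponential in $|X|$, so this is useful for checking small examples only, not as a representation for analysis.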

The 4-tuple $\langle Pos_X, \models, \wedge, \vee\rangle$ is a finite lattice,
where true is the top element and $\wedge X$ is the bottom element. The set of
(free) variables in a syntactic object $o$ is denoted by $var(o)$. Existential
quantification is defined by Schröder's Elimination Principle, that is,
$\exists x.f = f[x \mapsto true] \vee f[x \mapsto false]$. Also,
$\exists\{y_1, \ldots, y_n\}.f$ (project out) abbreviates
$\exists y_1. \ldots \exists y_n.f$ and $\overline{\exists} Y.f$ (project onto)
denotes $\exists (var(f) \setminus Y).f$. Two functions $f, f'$ are equivalent,
$f \equiv f'$, if and only if $f \models f'$ and $f' \models f$. Finally, for any
$f \in Bool_X$, $coneg(f) = model_X^{-1}(\{X \setminus M \mid M \in model_X(f)\})$.
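Schröder elimination can be illustrated directly on formulae encoded as predicates over the set of true variables. The Python sketch below is a minimal illustration under that encoding; the names `exists`, `project_onto` and `models` are ours, and `var_f` must be supplied by the caller because a predicate does not expose its variables.

```python
from itertools import combinations

def models(f, X):
    """The sets M of true variables (subsets of X) under which f holds."""
    xs = sorted(X)
    subsets = (frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r))
    return {M for M in subsets if f(M)}

def exists(x, f):
    """Schroeder elimination: (exists x. f) = f[x -> true] or f[x -> false]."""
    return lambda M: f(M | {x}) or f(M - {x})

def project_onto(Y, f, var_f):
    """Eliminate every variable of f (given as var_f) that is not in Y."""
    for v in sorted(set(var_f) - set(Y)):
        f = exists(v, f)
    return f

# f = (x <- y) & (y <- z); projecting onto {x, z} should give x <- z.
f = lambda M: ("y" not in M or "x" in M) and ("z" not in M or "y" in M)
g = project_onto({"x", "z"}, f, {"x", "y", "z"})
x_from_z = lambda M: "z" not in M or "x" in M
print(models(g, {"x", "z"}) == models(x_from_z, {"x", "z"}))
```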

3 Pos as Multiheaded Clauses 

A Boolean function is positive if and only if every clause in its conjunctive normal 
form representation contains at least one positive literal. A clause is described as 
multiheaded if it contains one or more positive literals. In this paper, multiheaded 
clauses are written as implications with the body a conjunction of variables and 
the head a disjunction of variables. That is, a multiheaded clause has the form: 

$$y_1 \wedge \ldots \wedge y_n \rightarrow x_1 \vee \ldots \vee x_m$$

Observe that the $y_i$ and the $x_j$ are distinct variables, otherwise the
clause is equivalent to true. Let $f \in MHC$ denote that $f$ is represented as
a conjunction of multiheaded clauses.

Proposition 1. For every $f \in Pos$ there is $f' \in MHC$ such that $f \equiv f'$.

Proof. It is well known that any Boolean formula is equivalent to another in
conjunctive normal form. Suppose $f \equiv f'$, where $f'$ is in conjunctive
normal form. Since $f$ is positive, every clause of $f'$ must contain at least
one positive literal, hence $f' \in MHC$. ∎

In the case that $m = 1$ the multiheaded clause is simply a propositional 
Horn clause. This suggests that the algorithms to calculate the domain oper- 
ations might perform well if they naturally specialise to efficient propositional 
Horn clause algorithms. This will be the case for entailment checking. Moreover, 
the multiheaded clauses representation is particularly amenable to widening. If 
widening is required, the representation may be restricted in linear time so that 
clauses with more than, say n, heads are discarded. If n = 1, this widening 
corresponds to restricting to Def. 
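As an illustration of the representation and of this widening, the Python sketch below encodes a multiheaded clause as a pair (body, head) of variable sets and a formula as a list of such pairs. This encoding and the names `clause`, `holds`, `meet` and `widen` are our own assumptions for the example; the analyser described in this paper is written in Prolog.

```python
from itertools import combinations

def clause(body, head):
    """Multiheaded clause  b_1 & ... & b_n -> h_1 | ... | h_m  (m >= 1)."""
    body, head = frozenset(body), frozenset(head)
    assert head, "a multiheaded clause needs at least one positive literal"
    return (body, head)

def holds(clauses, M):
    """Truth value of a conjunction of clauses under the assignment M."""
    return all(not body <= M or bool(head & M) for body, head in clauses)

def meet(f1, f2):
    """Conjunction: simply put the two clause lists together."""
    return f1 + f2

def widen(clauses, max_heads=1):
    """Discard clauses with more than max_heads head variables (one pass)."""
    return [(b, h) for b, h in clauses if len(h) <= max_heads]

f = [clause(["x"], ["y", "z"]), clause(["y"], ["x"]), clause([], ["w"])]
g = widen(f, max_heads=1)  # keeps only the Horn clauses y -> x and -> w

# Widening only drops conjuncts, so every model of f is a model of g.
variables = ["w", "x", "y", "z"]
assignments = [frozenset(c) for r in range(5) for c in combinations(variables, r)]
sound = all(holds(g, M) for M in assignments if holds(f, M))
print(g, sound)
```

Note that Python list concatenation copies its operands; the constant-time meet mentioned in this paper relies on Prolog difference lists instead.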

4 Domain Operations for Multiheaded Clauses 

This section gives algorithms for the domain operations of Pos represented as
multiheaded clauses. Meet ($\wedge$) is simply conjunction of clauses and is
constant time; the other domain operations described are join ($\vee$), relative
pseudo-complement ($\rightarrow$), entailment checking ($\models$) and
projection out ($\exists$). The algorithms form the basis of the groundness
analyser whose implementation is described in the next section.

4.1 Join 

Consider $f = f_1 \vee f_2$, where $f_1, f_2 \in MHC$. Suppose
$f_1 = c_1 \wedge \ldots \wedge c_n$ and $f_2 = d_1 \wedge \ldots \wedge d_m$.
Then, distributing,
$f \equiv f' = \wedge_{i=1}^{n}(\wedge_{j=1}^{m}(c_i \vee d_j))$. Suppose
$c_i = y_1 \wedge \ldots \wedge y_k \rightarrow x_1 \vee \ldots \vee x_l$ and
$d_j = u_1 \wedge \ldots \wedge u_p \rightarrow v_1 \vee \ldots \vee v_q$. Then
$c_i \vee d_j = y_1 \wedge \ldots \wedge y_k \wedge u_1 \wedge \ldots \wedge u_p
\rightarrow x_1 \vee \ldots \vee x_l \vee v_1 \vee \ldots \vee v_q \in MHC$.
Hence $f' \in MHC$. Since the above involves a quadratic blowup in the size of
the representation, join is quadratic in the size of the input formulae.
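The join construction can be sketched in a few lines of Python, assuming clauses are encoded as pairs (body, head) of frozensets. The encoding, the names `join` and `holds`, and the decision to drop clauses whose body and head share a variable (such clauses are equivalent to true) are our assumptions for this illustration.

```python
from itertools import combinations

def holds(clauses, M):
    """Truth value of a conjunction of (body, head) clauses under M."""
    return all(not body <= M or bool(head & M) for body, head in clauses)

def join(f1, f2):
    """Disjunction of two clause lists: pairwise union of bodies and heads."""
    out = []
    for b1, h1 in f1:
        for b2, h2 in f2:
            body, head = b1 | b2, h1 | h2
            if body & head:
                continue  # a variable in both body and head: clause is true
            out.append((body, head))
    return out

fs = frozenset
f1 = [(fs(), fs({"x"})), (fs({"x"}), fs({"y"}))]  # x & (x -> y)
f2 = [(fs(), fs({"z"}))]                          # z
f = join(f1, f2)                                  # (x | z) & (x -> y | z)

# (x -> y) | (y -> x) is a tautology, so its join is the empty conjunction.
taut = join([(fs({"x"}), fs({"y"}))], [(fs({"y"}), fs({"x"}))])

# Semantic check of f against f1 | f2 on every assignment.
assignments = [fs(c) for r in range(4) for c in combinations("xyz", r)]
agrees = all(holds(f, M) == (holds(f1, M) or holds(f2, M)) for M in assignments)
print(f, taut, agrees)
```

The nested loop makes the quadratic size of the result explicit: the output has at most one clause per pair of input clauses.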



4.2 Relative Pseudo-complement 

Relative pseudo-complement has recently been used to support backward
reasoning, in particular to trace control flow backward (right to left) to
infer moding properties of initial queries [20].

Consider $f = f_1 \rightarrow f_2$, where $f_1, f_2 \in MHC$. Suppose
$f_1 = c_1 \wedge \ldots \wedge c_n$ and $f_2 = d_1 \wedge \ldots \wedge d_m$.
Then, $f \equiv f' = \wedge_{j=1}^{m}(\vee_{i=1}^{n}(\neg c_i \vee d_j))$. Suppose
$c_i = y_1 \wedge \ldots \wedge y_k \rightarrow x_1 \vee \ldots \vee x_l$ and
$d_j = u_1 \wedge \ldots \wedge u_p \rightarrow v_1 \vee \ldots \vee v_q$. Then

$$\neg c_i \vee d_j = \left\{\begin{array}{l}
\wedge_{r=1}^{l}(x_r \wedge u_1 \wedge \ldots \wedge u_p \rightarrow v_1 \vee \ldots \vee v_q) \\
\wedge\; \wedge_{s=1}^{k}(u_1 \wedge \ldots \wedge u_p \rightarrow y_s \vee v_1 \vee \ldots \vee v_q)
\end{array}\right.$$

Hence $f' \in MHC$. Given that the size of $f'$ is exponential in the size of
$f_1$, the operation is exponential. However, it should be noted that many
analyses using positive Boolean functions (including groundness) do not require
this operation to be calculated. In such cases the cost of this operation is not
a drawback.



4.3 Entailment Checking 

Entailment checking for positive Boolean functions represented in conjunctive
normal form is co-NP complete [1]. However, as exploited in SAT solving, many
of the Boolean functions that arise in practice can be checked for satisfiability
with low complexity algorithms. This observation is exploited by the two
algorithms detailed below. The first, entailslite, is incomplete and takes
quadratic time in the size of the input. The second, entailsheavy, adds case
splitting to the first algorithm to obtain completeness (which is required to
guarantee termination in the fixpoint engine). This stratified algorithm usually
only requires entailslite to be invoked once.

The entailslite algorithm (see Figure 2) is an incomplete test that a
multiheaded clause is entailed by a conjunction of multiheaded clauses:
$\wedge_{i=1}^{l}(B_i \rightarrow H_i) \models B \rightarrow H$, where
$B = y_1 \wedge \ldots \wedge y_n$ and $H = x_1 \vee \ldots \vee x_m$. It works
by propagating deterministic bindings in an attempt to detect contradiction.
The algorithm terminates either when a contradiction is found or when no more
bindings can be propagated; then Flag is returned. Notice that this algorithm
contains forward chaining for propositional Horn clauses as a special case. Also
notice that the variables are assigned values only once. The auxiliary rename
produces a syntactic variant of a term which does not share any variables with
the original term.

process entailslite($\wedge_{i=1}^{l} B_i \rightarrow H_i$, $B \rightarrow H$)
    Flag := false;
    for i = 1 to m do $x_i$ := false;
    for j = 1 to n do $y_j$ := true;
    for k = 1 to l do
        spawn forward($B_k$, $H_k$, Flag);
        spawn backward($B_k$, $H_k$, Flag);
    return Flag.

process forward(B, H, Flag)
    block until every $x \in B$ bound
    if $\wedge B$ = true then spawn maketrue(H, Flag)
    else stop.

process backward(B, H, Flag)
    block until every $y \in H$ bound
    if $\vee H$ = false then spawn makefalse(B, Flag)
    else stop.

process maketrue(H = $\{y_1, \ldots, y_m\}$, Flag)
    block until $y_i \in H$ changes for some $i \in \{1, \ldots, m\}$
    if $\vee H$ = false then Flag := true; stop
    else if $\vee H$ = true then stop
    else if $\vee H = y_i$ for some $i \in \{1, \ldots, m\}$ then $y_i$ := true; stop
    else suspend.

process makefalse(B = $\{x_1, \ldots, x_n\}$, Flag)
    block until $x_i \in B$ changes for some $i \in \{1, \ldots, n\}$
    if $\wedge B$ = true then Flag := true; stop
    else if $\wedge B$ = false then stop
    else if $\wedge B = x_i$ for some $i \in \{1, \ldots, n\}$ then $x_i$ := false; stop
    else suspend.

Fig. 2. The entailslite Algorithm

The algorithm entailsheavy (see Figure 3) applies case splitting if entailslite
does not detect entailment. The number of cases is potentially exponential in
the number of variables left unbound by entailslite. However, propagation
occurs after each binding, therefore deep case splitting is rarely required. A
more intelligent splitting strategy (as in SAT solving) could be applied, but
the naive strategy performs more than adequately.

Proposition 2. The algorithm entailslite is sound, but not complete for entail- 
ment checking. The algorithm entailsheavy is both sound and complete. 
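The stratification can be illustrated with a sequential Python analogue. This is a sketch under our own assumptions, not the concurrent Prolog implementation of Figures 2 and 3: clauses are pairs (body, head) of frozensets, bindings live in a dictionary, and the names `propagate`, `entails_lite` and `entails_heavy` are ours.

```python
def propagate(clauses, assign):
    """Propagate deterministic bindings; return True on contradiction."""
    changed = True
    while changed:
        changed = False
        for body, head in clauses:
            if any(assign.get(v) is False for v in body) or \
               any(assign.get(v) is True for v in head):
                continue  # clause already satisfied
            free_body = [v for v in body if v not in assign]
            free_head = [v for v in head if v not in assign]
            if not free_body and not free_head:
                return True  # body all true, head all false
            if len(free_body) + len(free_head) == 1:
                if free_head:
                    assign[free_head[0]] = True   # forward step
                else:
                    assign[free_body[0]] = False  # backward step
                changed = True
    return False

def refute_goal(goal):
    """Bindings that falsify the goal clause: body true, head false."""
    body, head = goal
    assign = {v: True for v in body}
    assign.update({v: False for v in head})
    return assign

def entails_lite(clauses, goal):
    """Incomplete check: True only if propagation alone finds a contradiction."""
    return propagate(clauses, refute_goal(goal))

def entails_heavy(clauses, goal):
    """Complete check: propagation plus naive case splitting."""
    def split(assign):
        assign = dict(assign)
        if propagate(clauses, assign):
            return True
        free = sorted({v for b, h in clauses for v in b | h} - set(assign))
        if not free:
            return False  # found an assignment satisfying F but not the goal
        x = free[0]
        return split({**assign, x: True}) and split({**assign, x: False})
    return split(refute_goal(goal))

fs = frozenset
# F = (a | b) & (a -> b | g) & (b -> a | g) & (a & b -> g) entails g,
# but no clause becomes deterministic once g is bound to false.
F = [(fs(), fs("ab")), (fs("a"), fs("bg")), (fs("b"), fs("ag")), (fs("ab"), fs("g"))]
goal = (fs(), fs("g"))
print(entails_lite(F, goal), entails_heavy(F, goal))
```

On Horn clauses the forward step alone is forward chaining, which is the special case mentioned in the text.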



process entailsheavy(F, f)
    Flag := entailslite(F, f);
    if Flag = true then return true
    else V := var(F) = $\{x_1, \ldots, x_n\}$;
        if V = $\emptyset$ then return false
        else do
            rename($F \wedge f$) = ($F' \wedge f'$);
            Flag' := entailsheavy($\{x'_1 \mapsto true\}F'$, $\{x'_1 \mapsto true\}f'$);
            if Flag' = true then
                rename($F \wedge f$) = ($F'' \wedge f''$);
                return entailsheavy($\{x''_1 \mapsto false\}F''$, $\{x''_1 \mapsto false\}f''$)
            else return false.

Fig. 3. The entailsheavy Algorithm

4.4 Projection

As in [19], projection is calculated using a Fourier-Motzkin style algorithm. The
projection of a single variable out of a pair of clauses, one of which contains the
variable in the body and the other in its head, is performed by syllogising as
follows:

$$\exists z.\left(\begin{array}{l}
y_1 \wedge \ldots \wedge y_p \rightarrow z \vee x_1 \vee \ldots \vee x_q \\
\wedge\; z \wedge y_{p+1} \wedge \ldots \wedge y_n \rightarrow x_{q+1} \vee \ldots \vee x_m
\end{array}\right)
= y_1 \wedge \ldots \wedge y_n \rightarrow x_1 \vee \ldots \vee x_m$$

The correctness and completeness of this is easily confirmed using Schröder
elimination, hence the algorithm below is also correct and complete. In general,
each variable is eliminated in turn, as follows. Suppose $z$ is to be projected
out of $f$.

1. All those clauses with $z$ in the head are found, giving $\{C_i \mid i \in I\}$
   where $I$ is a (possibly empty) index set.

2. All those clauses with $z$ in the body are found, giving $\{D_j \mid j \in J\}$
   where $J$ is a (possibly empty) index set.

3. These clauses of $f$ are replaced by
   $\{\exists z.(C_i \wedge D_j) \mid i \in I, j \in J\}$.

4. A compact representation is maintained by eliminating redundant clauses
   (absorption).



Step 4 means that the algorithm is parameterised by the compaction process. 
Compaction does not necessarily have to remove all redundant clauses (or indeed 
any), hence a tradeoff can be made between keeping the representation small 
and the cost of this maintenance. In projecting out a single variable, syllogising 
gives a quadratic blowup in the size of the representation. Thus the basic cost 
of projecting out a single variable is quadratic. However, the compaction step 
takes as its input a representation quadratic in the size of the original and the 
overall cost is dependent on the compaction algorithm. In the implementation, 
entailslite is used for compaction therefore the cost of projecting out a single 
variable is quartic. Because of the size blowup, projecting an arbitrary function 
onto a finite set of variables is exponential. 
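The four steps above can be sketched in Python for clauses encoded as pairs (body, head) of frozensets. The encoding and the name `project_out` are our assumptions, and the compaction step here is deliberately minimal (it only drops trivially true clauses and exact duplicates) rather than the entailslite-based compaction used in the implementation.

```python
def project_out(z, clauses):
    """Eliminate variable z from a list of (body, head) clauses by syllogising."""
    with_head = [(b, h) for b, h in clauses if z in h]          # step 1
    with_body = [(b, h) for b, h in clauses if z in b]          # step 2
    result = [(b, h) for b, h in clauses if z not in b and z not in h]
    for b1, h1 in with_head:                                    # step 3
        for b2, h2 in with_body:
            body = b1 | (b2 - {z})
            head = (h1 - {z}) | h2
            if body & head:
                continue  # equivalent to true
            if (body, head) not in result:                      # step 4 (minimal)
                result.append((body, head))
    return result

fs = frozenset
f = [(fs({"x"}), fs({"y"})),        # x -> y
     (fs({"y"}), fs({"z"})),        # y -> z
     (fs({"w"}), fs({"x"}))]        # w -> x
g = project_out("y", f)             # w -> x and x -> z remain

# A ground variable feeding a two-headed clause: y & (y -> x | w) gives x | w.
h = project_out("y", [(fs(), fs({"y"})), (fs({"y"}), fs({"x", "w"}))])
print(g, h)
```

The nested loop over `with_head` and `with_body` is the source of the quadratic blowup discussed above.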



5 A Pos-Based Groundness Analyser 

To assess the representation, two Pos-based groundness analysers built on multi- 
headed clauses were implemented in Prolog: one goal-dependent and one goal- 
independent. The analysers illustrate the ease with which the multiheaded clause 
representation can be used. The analysers perform surprisingly well compared
with other Pos analysers (including those with BDD-based Boolean function ma- 
nipulation coded in C) and compared with analysers using more computationally 
tractable domains. This section details the Prolog implementation. 

5.1 A GEP Representation 

As in [2,19], the analyser maintains a factorised representation, that is, as a
product of subdomains. The factorisation is encoded in the call and answer
patterns. A call (or answer) pattern is a pair $\langle a, f\rangle$ where $a$
is an atom and $f \in Pos$. Normally the arguments of $a$ are distinct
variables. The formula $f$ is a conjunction (list) of multiheaded clauses. In a
non-ground representation the arguments of $a$ can be instantiated and aliased
to express simple dependency information [17]. For example, if
$a = p(x_1, \ldots, x_5)$, then the atom $p(x_1, true, x_1, x_4, true)$
represents $a$ coupled with the formula
$(x_1 \leftrightarrow x_3) \wedge x_2 \wedge x_5$. This enables the abstraction
$\langle p(x_1, \ldots, x_5), f_1\rangle$ to be collapsed to
$\langle p(x_1, true, x_1, x_4, true), f_2\rangle$ where
$f_1 = (x_1 \leftrightarrow x_3) \wedge x_2 \wedge x_5 \wedge f_2$. This
encoding leads to a more compact representation and is similar to the GER
factorisation of ROBDDs proposed by Bagnara and Schachte [2]. The
representation of call and answer patterns described above is called GEP
(groundness, equivalences and propositional clauses) where the atom captures
the first two properties and the formula the latter.
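
As a rough illustration of how the atom carries the GE component, the Python
sketch below builds the argument tuple from a set of ground positions and a
list of equivalent position pairs. The function name ge_atom, the 1-based
positions and the string encoding of arguments are assumptions made for the
example; the analyser itself uses Prolog terms with shared variables.

```python
def ge_atom(pred, arity, ground=(), equiv=()):
    """Return (pred, args) where args encodes groundness and equivalences.

    ground: 1-based argument positions known to be ground.
    equiv:  pairs (i, j) of 1-based positions known to be equivalent.
    """
    parent = list(range(arity + 1))      # simple union-find over positions

    def find(i):
        while parent[i] != i:
            i = parent[i]
        return i

    for i, j in equiv:
        ri, rj = find(i), find(j)
        parent[max(ri, rj)] = min(ri, rj)   # smallest index names the class

    # A class is ground as soon as one of its members is ground.
    ground_classes = {find(i) for i in ground}
    args = tuple("true" if find(i) in ground_classes else f"x{find(i)}"
                 for i in range(1, arity + 1))
    return (pred, args)
```

With ground positions 2 and 5 and the equivalence of positions 1 and 3, this
yields the tuple corresponding to $p(x_1, true, x_1, x_4, true)$ from the text.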

The GEP representation is advantageous since it gives a compact represen- 
tation whilst incurring little overhead when the representation is non-ground. 
The compactness of the representation affects memory usage and the complex- 
ity of domain operations. As demonstrated in [17], many dependencies arising 
in groundness analysis fall into the GE component. By using the GEP represen- 
tation, many calls to expensive domain operations are avoided. Note that (as 
in [19]) the analyser does not maintain the factorisation strictly. Dependencies 
that could be encoded in the GE component may exist in the P component - 
the advantage of this is that the implementor may choose to update the GE 
component only when most computationally convenient. 

5.2 Domain Operations for the GEP Representation 

Meet. The meet of the pairs $\langle a_1, f_1\rangle$ and
$\langle a_2, f_2\rangle$ can be computed by unifying $a_1$ and $a_2$ and
concatenating $f_1$ and $f_2$.

Renaming. The objects that require renaming are formulae and call (answer) 
pattern GEP pairs. If a dynamic database is used to store the pairs, then re- 
naming is automatically applied each time a pair is looked-up in the database. 
Formulae can be renamed with a single call to the Prolog builtin copy_term.

Entailment. Entailment checking works on three levels, each called under a
negation so as not to produce any problematic bindings. The first entailment
check operates only on the GE component (and is complete for this component).
Entailment of the functions encoded in the GE component is denoted
$a_1 \models a_2$. To test this, bind each distinct variable in $a_1$ to a
distinct ground constant, resulting in $a_1'$. If, after this has been
performed, $a_1'$ can be unified with $a_2$, then $a_1 \models a_2$; otherwise
$a_1 \not\models a_2$. The second entailment check is only applied to formulae
in the P component. This implements the (incomplete) entailslite algorithm
described in section 4.3. The propagating processes are realised using block
declarations. A single pass over the formulae sets up the processes and each
clause results in two processes at any one time. The cost of suspending and
resuming these processes is constant time, so propagation is achieved with very
little overhead. The third entailment check implements a variant of the
entailsheavy algorithm described in section 4.3. The copy_term builtin produces
a renamed formula with new variables such that if any of the original variables
have processes blocked on them, then the new variables will have copies of the
processes blocked on them. This saves repeating work in the calls to
entailslite.
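
The first (GE) level of the check can be mimicked outside Prolog. The sketch
below is a Python approximation under stated assumptions: arguments are the
string "true" or a variable name, the two atoms share no variables, and
ge_entails is a name invented for the example. It freezes the variables of the
first atom to distinct constants and then attempts a one-sided unification
with the second.

```python
def ge_entails(args1, args2):
    """Return True if the GE information of args1 entails that of args2."""
    if len(args1) != len(args2):
        return False
    # Bind each distinct variable of args1 to a distinct ground constant.
    frozen = tuple(a if a == "true" else ("const", a) for a in args1)
    # Try to unify the frozen arguments with args2.
    binding = {}
    for f, b in zip(frozen, args2):
        if b == "true":
            if f != "true":          # args2 demands groundness that args1 lacks
                return False
        elif binding.setdefault(b, f) != f:
            return False             # args2 aliases positions that args1 keeps apart
    return True
```

For instance, the arguments of $p(x_1, true, x_1, x_4, true)$ entail those of
$p(y_1, true, y_3, y_4, y_5)$, but not conversely.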

Projection. Projection is only applied to formulae in the P component. It is 
performed using the algorithm given in section 4.4. Clauses produced by projec- 
tion that are equivalent to true (that is, the intersection of the head and body 
variables is nonempty) are immediately discarded. The compaction step is based 
on the entailslite algorithm. However, as the purpose of compaction is to prevent 
an explosion in the size of the representation, compaction is only performed if 
the representation after syllogising is larger than beforehand. Since entailslite is 
incomplete some redundant clauses may be retained, however this is more than 
compensated by the reduced complexity of compaction. 

Join. Calculating the join of the pairs $\langle a_1, f_1\rangle$ and
$\langle a_2, f_2\rangle$ is complicated by the way that join interacts with
renaming. Specifically, in a non-ground representation, call (answer) patterns
would typically be stored in a dynamic database so that
$var(a_1) \cap var(a_2) = \emptyset$. Hence $\langle a_1, f_1\rangle$ (or
equivalently $\langle a_2, f_2\rangle$) has to be appropriately renamed before
the join is calculated. This is achieved as follows. Plotkin's anti-unification
algorithm [22] is used to compute the most specific atom $a$ that generalises
$a_1$ and $a_2$. (But observe that if $a_1 \models a_2$, $a_2$ is a most
specific generalisation of the atoms.) The basic idea is to reformulate $a_1$
as a pair $\langle a_1', f_1'\rangle$ which satisfies two properties: $a_1'$ is
a syntactic variant of $a$; the pair represents the same dependency information
as $\langle a_1, true\rangle$. A pair $\langle a_2', f_2'\rangle$ reformulating
$a_2$ is likewise constructed. The atoms $a$, $a_1'$ and $a_2'$ are unified and
the formula $f' = (f_1 \wedge f_1') \vee (f_2 \wedge f_2')$ is calculated. This
calculation is filtered by entailment checking. If
$f_1 \wedge f_1' \models f_2 \wedge f_2'$ can be detected using entailslite,
then $f' = f_2 \wedge f_2'$ (and symmetrically). In this case the entailment
check saves a call to join (and the associated projection) and the creation of
a new data-structure, $f'$. Otherwise the join $f'$ is computed as in section
4.1. Redundant clauses are removed from $f'$ using entailslite to give $f$, and
thereby the join $\langle a, f\rangle$.
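
The anti-unification step can be illustrated for the flat atoms used here,
whose arguments are either true or a variable. The Python sketch below is a
simplification of Plotkin's algorithm to that case; the function name
anti_unify and the generated variable names z1, z2, ... are assumptions of the
example.

```python
def anti_unify(args1, args2):
    """Most specific generalisation of two flat argument tuples.

    Each argument is the string "true" or a variable name. Equal pairs of
    arguments are mapped to the same fresh variable, so aliasing that is
    common to both atoms is preserved.
    """
    table = {}
    out = []
    for s, t in zip(args1, args2):
        if s == "true" and t == "true":
            out.append("true")                   # groundness shared by both atoms
        else:
            if (s, t) not in table:
                table[(s, t)] = f"z{len(table) + 1}"
            out.append(table[(s, t)])
    return tuple(out)
```

Only the groundness and aliasing present in both atoms survive in the result;
the information that is lost from each atom is what the formulae $f_1'$ and
$f_2'$ described above have to reintroduce.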



5.3 Fixpoint Algorithms 

The goal-dependent analyser is driven by an induced magic based iteration strat- 
egy, refining that used in [19]. Induced magic was introduced in [5], where a 
meta-interpreter for semi-naive, goal-dependent, bottom-up evaluation is
presented. Simple optimisations can significantly impact on performance. In
particular, as noted in [18], evaluations resulting from new calls should be performed
before those resulting from new answers, and a call to solve for one rule should 
finish before another call to solve for another rule starts. These optimisations 
have been incorporated into the induced magic framework by using an explicit 
redo list storing those call and answer patterns which have changed, thereby 
defining the clauses which need to be reevaluated. The goal-independent anal- 
yser is based on semi-naive iteration. Neither of these analysers has exploited 
condensing [16,21]. 



6 Experimental Evaluation 

To assess the feasibility of multiheaded clauses as a representation of positive 
Boolean functions, the Pos-based groundness analysers were tested on a large 
benchmark suite. 

BDD representations of Boolean functions have been popular for the imple- 
mentation of Pos-based groundness analysers. For this reason an analyser using 
a BDD package has also been instrumented. The BDD package available does 
not employ a GER factorisation. However, it should be noted that turning off 
the GEP factorisation with the multiheaded clause analyser does not greatly 
affect its performance. This is a strength of clausal representations. An RCNF
analyser was also implemented in Prolog to aid the assessment of MHC. The
three goal-dependent analysers share the same fixpoint algorithm and therefore 
run in lock-step. 

The analysers are coded in SICStus Prolog 3.8.6 with the exception of the
domain operations for BDD-based Pos, which were written in C by Schachte
[23], and compiled with the -O2 level of optimisation. The analysers were run on a
296MHz Sun UltraSPARC-II with 1GByte of RAM running Solaris 7. Programs
are abstracted following the elegant (two program) scheme of [4] to guarantee 
correctness. Programs containing disjunctions are normalised to definite clauses. 
Timeouts were set at two minutes. 

Table 1 presents the experimental results for the larger programs in the
benchmark suite. The columns detail the following information. file: the
program name; size: the number of abstract clauses; abs: the time required to
read, parse, normalise and abstract the program. For goal-dependent analysis
the fixpoint times for the MHC, RCNF and BDD analysers are given, along with
count: the number of ground argument positions in the call and answer patterns
found by the analyser. For goal-independent analysis, the fixpoint times for
MHC are given, along with the number of ground arguments in the success
patterns. Timeout is denoted by -. The goal-independent counts are occasionally
larger than the goal-dependent counts owing to the presence of code unreachable
from the initial query.

Table 1. Timing and Precision Results

                                       goal-dep.              goal-indep.
file                size   abs    MHC   RCNF   BDD  count     MHC  count
bridge.clpr           68  0.09   0.00   0.12  0.03     24    0.08     34
conman.pl             76  0.05   0.00   0.00  0.03      6    0.01      6
unify.pl              77  0.05   0.07   0.29  0.08     70    0.09     19
kalah.pl              78  0.05   0.02   0.11  0.04    199    0.02     42
nbody.pl              85  0.07   0.05   0.13  0.06    113    0.04     57
peep.pl               85  0.12   0.03   0.08  0.04     10    0.02      8
sdda.pl               89  0.06   0.04   0.07  0.05     17    0.02      4
bryant.pl             94  0.07   0.32   2.38  0.15     99    0.28      9
boyer.pl              95  0.08   0.05   0.07  0.04      3    0.02      5
read.pl              101  0.09   0.05   0.23  0.08     99    0.03     37
qplan.pl             108  0.09   0.03   0.25  0.07    216    0.05     27
trs.pl               108  0.13   0.10   2.28  0.26     13    0.04      7
press.pl             109  0.09   0.11   0.27  0.12     53    0.04     32
reducer.pl           113  0.07   0.08   0.17  0.09     41    0.05     21
parser_dcg.pl        122  0.09   0.09   0.29  0.08     43    0.04     24
simple_analyzer.pl   140  0.10   0.16   0.48  0.13     89    0.10     31
dbqas.pl             143  0.09   0.03   0.04  0.04     18    0.03     24
ann.pl               146  0.11   0.16   0.43  0.10     71    0.09     12
asm.pl               160  0.17   0.05   0.19  0.09     90    0.14     16
nand.pl              179  0.14   0.05   1.46  0.14    402    0.68     16
lnprolog.pl          220  0.10   0.08   0.19  0.12    143    0.07     31
ili.pl               221  0.15   0.55   1.63  0.13      4    0.15      5
strips.pl            240  0.22   0.03   0.07  0.08    142    0.06     36
sim.pl               244  0.22   1.09  24.78  0.25    100    0.62     33
rubik.pl             255  0.21   0.22  25.32  0.20    158    0.16     51
chat_parser.pl       281  0.36   0.29   1.75  0.26    505    0.30    128
sim_v5-2.pl          288  0.23   0.07   0.33  0.16    457    0.10     37
peval.pl             332  0.18   0.64   4.62  0.16     27    1.30     17
aircraft.pl          395  0.54   0.15   0.70  0.41    687    0.12    196
essln.pl             595  0.48   0.19  20.72  0.37    162    0.30     75
chat_80.pl           883  1.43   0.88   4.28  0.84    855    0.64    339
aqua_c.pl           3928  3.55   7.68  67.04     -   1285    6.59    458

Multiheaded clauses perform consistently better than RCNF for goal-dependent
analysis. This is unsurprising given the cost of meet and the relative expense
of equivalence checking via dual Blake canonical form, together with the
filtering applied to join in MHC. MHC compares favourably with BDDs, especially
considering that the BDD operations exploit memoisation and are coded in C. In
terms of runtime, MHC and BDDs give similar results, although as would be
expected, the different representations performed differently on different
programs. For example, BDDs perform well on sim.pl, whereas MHC performs well
on sim_v5-2.pl. The MHC analyser appears to scale smoothly for both
goal-dependent and goal-independent analysis. Of course, any Pos-based analyser
can be broken using the schema from [6,14]; the analyser can deal with the
arity 14 case of [6] before timeout (that is, a single predicate requiring
16384 iterations).

The major cost in entailment checking is incurred through case splitting in 
entailsheavy. Instrumentation has revealed that the total number of times
entailslite is invoked in checking $F \models f$ almost never exceeds $|var(F)|$. Therefore in
practice entailsheavy exhibits cubic behaviour in the size of the input formulae. 
Further instrumentation has shown that the maximum number of heads observed 
in a clause is four. These maxima occur infrequently. Since most clauses have few 
heads, typically only a small number of bindings have to be made before prop- 
agation binds sufficient variables to return the Flag. The calls to entailsheavy 
typically do not detect entailment, as the vast majority of entailments are de- 
tected using entailslite. As disentailment is demonstrated by the discovery of a 
single countermodel, the binding of a small number of variables to their value 
in a countermodel is often enough to generate the rest of this countermodel via 
propagation. This helps to explain the success of the stratified entailment check. 



7 Related Work 



The efficiency of groundness analysis depends on the way dependencies are repre- 
sented and implemented. The representation decides the algorithmic complexity 
of the domain operations but the implementation can introduce a prohibitive 
constant factor or even push the complexity into a higher class if there is not a 
good match between the representation and the implementation language.
Efficient BDD-based Pos analysers are usually implemented in languages with
mutable data-structures such as C [24] or SML [10,11]. State-of-the-art BDD-based
groundness analysers employ a GE factorisation [2] which keeps simple definite 
information separate from dependency information. This leads to a particularly 
dense representation (meant informally, a small number of nodes/clauses in the 
representation) and is therefore an important implementation tactic. 

The density of the representation is as important to Prolog as it is to C: the 
density determines the size of the inputs to the domain operations, as well as im- 
pacting on memory usage. The dual Blake canonical form representation of Def 
functions [1,9] is attractive as it is amenable to Prolog implementation [12] and 
it gives a unique representation for every Def function (up to variable ordering).
However, its requirement to make transitive variable dependencies explicit can
compromise density. For example, the function
$(x \leftarrow y) \wedge (y \leftarrow z)$ is represented as
$(x \leftarrow (y \vee z)) \wedge (y \leftarrow z)$. Because of this Howe and
King [19] present a (non-orthogonal [1]) clausal representation of Def as
conjunctions of propositional
clauses, but do not maintain a canonical form. Therefore entailment checking is 
required to detect stability. 

Recently, Genaim and Codish [13] have proposed a dual representation for
Def. For function $f$, the models of $coneg(f)$ are named and $f$ is
represented by a tuple recording for each variable of $f$ which of these models
the variable is in. For example, the models of $coneg(x \rightarrow y)$ are
$\{\{x, y\}, \{x\}, \emptyset\}$. Naming the three models $a$, $b$, $c$
respectively, $f$ is represented by $\langle ab, a\rangle$. This representation
cleverly allows ACI1 unification theory to be used for the domain operations
and elegantly supports a GE factorisation. Promising experimental results are
reported [13], but a widening is required to analyse the aqua_c benchmark.
Codish and Demoen [7] describe a model based Prolog implementation technique
for Pos that would encode $x_1 \leftrightarrow (x_2 \wedge x_3)$ as three
tuples $\langle true, true, true\rangle$, $\langle false, \_, false\rangle$,
$\langle false, false, \_\rangle$. The technique performs well against
BDD-based Pos analysis of its era [24] but it does not scale smoothly to the
larger benchmarks. Heaton et al. [17] therefore propose EPos, a sub-domain of
Def, that can only propagate dependencies of the form
$(x_1 \leftrightarrow x_2) \wedge x_3$ across procedure boundaries. This
information is precisely that contained in one of the fields of the GE
factorisation. The main finding of [17] is that this sub-domain retains
reasonable precision for goal-dependent analysis and possesses good scaling
behaviour.

8 Conclusion 

Positive Boolean functions can be naturally expressed as multiheaded clauses 
which are straightforward to understand, manipulate and code in Prolog. Multi- 
headed clauses have been used as the basis for efficient goal-dependent and goal- 
independent Pos-based groundness analysers. The key to the success of these 
analysers is their constant time meet and their use of entailment checking suc- 
cinctly and efficiently coded using block declarations. Entailment checking is 
stratified so that many entailments are detected using a low complexity algo- 
rithm. The full exponential algorithm is only applied when necessary for detect- 
ing stability, and even then the number of case splits is typically very small. The 
analysers do not require widening for any of the benchmarks; however, natural 
widenings to Def or to EPos are available if required [6,14]. This work illustrates 
the subtlety of choosing a representation and its associated operations, even for 
a well known domain. Minor changes to the representation can have a signif- 
icant impact on performance if they affect frequently occurring operations. It 
also demonstrates the effectiveness of stratifying high complexity operations to 
avoid expensive computation whenever possible. The intelligent application of 
the simple entailment checking algorithm is the heart of the analyser presented 
in this paper. 
Abstract. Groundness analysis of logic programs using Pos-based
abstract interpretation is one of the clear success stories of the last decade
in the area of logic program analysis. In this work we identify two prob- 
lems with the Pos domain, the multiplicity and sign problems, that arise 
independently in groundness and uniqueness analysis. We describe how 
these problems can be solved using an analysis based on a domain Size 
for inferring term size relations. However this solution has its own short- 
comings because it involves a widening operator which leads to a loss of 
Pos information. Inspired by Pos, Size and the LSign domain for abstract 
linear arithmetic constraints we introduce a new domain LPos, and show 
how it can be used for groundness and uniqueness analysis. The idea is to 
use the sign information of LSign to improve the widening of Size so that 
it does not lose Pos information. We prove that the resulting analyses 
using LPos are uniformly more precise than those using Pos. 



1 Introduction 

Groundness analysis of logic programs using Pos-based [7,18,19] abstract inter- 
pretation [9] is one of the success stories of the last decade in the area of logic 
program analysis. Moreover, Pos can be the basis for many other applications 
such as finiteness analysis, rigidity analysis, type analysis, and suspension anal- 
ysis (for concurrent logic programs). It has also been applied in the context 
of constraint logic programs to determine when variables have obtained a final 
unique value [2]. Here we shall refer to such an analysis as uniqueness analysis. 

The abstract domain Pos [7,18] consists of the positive Boolean functions, 
ordered by logical consequence. For groundness analysis, the idea is to express 
groundness information about a set of substitutions as a Boolean function. For 
example, the logical reading of the formula
$\varphi = x \wedge (y \rightarrow z)$ for groundness analysis is: on
application of any substitution, $x$ is ground and if $y$ becomes ground then
so does $z$. More formally, we say that a Boolean function $\varphi$ describes
a substitution $\theta$ if any set of variables that might become ground by
further instantiating $\theta$ is a model of $\varphi$. For uniqueness
analysis, the situation is similar, but the Boolean function is interpreted as
a statement about uniqueness dependencies. For example, if any two of the three
variables in the linear constraint $x = 2y + 3z$ have a unique value then so
does the third. This is described by
$((x \wedge y) \rightarrow z) \wedge ((y \wedge z) \rightarrow x) \wedge ((z \wedge x) \rightarrow y)$.

Pos is attractive because its structure and corresponding abstract opera- 
tions are simple and Pos-based analysers are efficient in practice [12,22]. But 
the domain has its limits when we consider precision of analysis. This paper
exposes two shortcomings and illustrates their effect on groundness and
uniqueness analyses for logic and constraint logic programs. The first problem
relates to the fact that a Boolean function does not capture information about
multiplicity of variables, so the term equations $[x|xs] = [x|ys]$ and
$[x|xs] = [x,x|ys]$ are both approximated by
$(x \wedge xs) \leftrightarrow (x \wedge ys)$. The second problem arises
because Pos abstractions for arithmetic constraints ignore the signs of
coefficients, so the systems of constraints
$(x - y - z = 0) \wedge (x - y = 3)$ and $(x - y - z = 0) \wedge (x + y = 3)$
are both given the Pos description
$(x \leftrightarrow y) \wedge (x \rightarrow z)$.

We look at several possible solutions to the two problems. For groundness 
analysis, the use of size relations [3,11,23] addresses the multiplicity problem, 
and can provide groundness information. A substitution is approximated by a 
constraint on term sizes. For example, the inequality $x \le y + z$ describes a
substitution for which the size of the term that $x$ is bound to is smaller than
or equal to the sum of the sizes of the terms that $y$ and $z$ are bound to (for
some appropriately chosen measure of term size). Groundness dependencies are
captured, since any substitution which makes $y$ and $z$ ground must make $x$
ground to satisfy the
size constraint. While the abstract domain Size of size constraints can be shown 
to be at least as precise as Pos, it has infinite ascending chains and so must 
be applied in combination with a widening operator [11]. Widening sometimes 
results in the loss of groundness information so that in many cases a Pos based 
groundness analysis is more precise than a size relations based analysis. 

For uniqueness analysis, the abstract domain LSign [20], where linear con- 
straints are abstracted by replacing coefficients by their signs, can provide the 
needed information about signs of coefficients. Intuitively an LSign
description
$\oplus x + \ominus y + \ominus z = 0 \wedge \oplus x + \oplus y = \oplus$
defines constraints $C$ equivalent to
$a_1 x - a_2 y - a_3 z = 0 \wedge a_4 x + a_5 y = a_6$, $a_i > 0$,
$1 \le i \le 6$, for example $(x - y - z = 0) \wedge (x + y = 3)$. The
coefficient information is enough to determine the uniqueness information
$(x \leftrightarrow y) \wedge (x \leftrightarrow z)$. Unfortunately, LSign is aimed
at capturing a different kind of information, namely whether a represented con- 
straint must be satisfiable or not, and LSign operations quickly lose information 
important to uniqueness analysis. 

Inspired by Size and LSign we propose a new domain, LPos, which improves 
on Pos, both for groundness analysis and for uniqueness analysis. The idea is to 
use the sign information of LSign to improve the widening of Size so that it does 
not lose Pos information. LPos uses the same style of description as LSign and, 
importantly, shares the use of an abstract Fourier elimination algorithm for the 
abstract projection operation. It differs crucially from LSign by using operations 
which maintain better uniqueness information. 



2 Problems with Pos 



Program (left):

    bf(Tree, List) :-
        bfq(s(zero), [Tree|QT], QT, List).

    bfq(zero, Q, Q, []).
    bfq(s(N), [nil|QH], QT, Es) :-
        bfq(N, QH, QT, Es).
    bfq(s(N), [t(L,E,R)|QH], [L,R|QT], [E|Es]) :-
        bfq(s(s(N)), QH, QT, Es).

Pos analysis (right):

    $bf(u, v) = \exists qt : bfq(true, u \wedge qt, qt, v)$

    $bfq(w, x, y, z) = [w \wedge (x \leftrightarrow y) \wedge z]$
        $\vee\ [\exists n, qh : (w \leftrightarrow n) \wedge (x \leftrightarrow qh) \wedge bfq(n, qh, y, z)]$
        $\vee\ [\exists n, l, e, r, qt, qh, es :$
            $(w \leftrightarrow n) \wedge (x \leftrightarrow l \wedge e \wedge r \wedge qh) \wedge$
            $(y \leftrightarrow l \wedge r \wedge qt) \wedge (z \leftrightarrow e \wedge es) \wedge$
            $bfq(n, qh, qt, es)]$

Fig. 1. Breadth-first traversal of binary trees (left) with Pos analysis (right).

Recall that a Boolean function is positive if it is satisfied by the assignment
of true to all of its variables. The abstract domain Pos consists of the set of
positive Boolean functions (together with false as a bottom element) ordered by
logical consequence [18,7,19]. The abstract operations that arise in analyses are:
Boolean conjunction, as a greatest lower bound, Boolean disjunction, as a least 
upper bound, and existential quantification, for projection. There is a substantial 
body of literature on this type of analysis, for example, [1,4,8]. In this section 
we describe two problems with the Pos domain. 



The multiplicity problem: The first problem arises in groundness analysis 
of logic programs. For groundness analysis, a Boolean function $\varphi$
describes a substitution $\theta$ if any set of variables that might become
ground by further instantiating $\theta$ is a model of $\varphi$. The
unification of two terms $t$ and $s$ is described in Pos by the Boolean formula
$(x_1 \wedge x_2 \wedge \dots \wedge x_n) \leftrightarrow (y_1 \wedge y_2 \wedge \dots \wedge y_m)$
where $\{x_1, x_2, \dots, x_n\}$ and $\{y_1, y_2, \dots, y_m\}$ are the sets of
variables in $t$ and $s$.

The multiplicity problem for Pos is a loss of groundness information because
the abstraction function (of a Herbrand constraint) ignores the multiplicity of
variables in terms. For example, the two term equations $[x|xs] = [x|ys]$ and
$[x|xs] = [x,x|ys]$ are both described by the Boolean function
$(x \wedge xs) \leftrightarrow (x \wedge ys)$. The inability of a Pos analysis
to distinguish the two explains why it cannot find the (otherwise correct)
description $xs \leftrightarrow ys$ for the first equation.
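
The loss of multiplicity can be made concrete with a small Python sketch. The
names pos_abstract and models, and the representation of a term by the list of
its variable occurrences, are assumptions made purely for illustration.

```python
from itertools import product

def pos_abstract(lhs_occurrences, rhs_occurrences):
    """Abstract t = s by (conjunction of vars(t)) <-> (conjunction of vars(s)).

    Occurrence lists collapse to sets, which is exactly where multiplicity
    is lost.
    """
    return (frozenset(lhs_occurrences), frozenset(rhs_occurrences))

def models(abstraction, names):
    """Enumerate the models of the abstraction as sets of true variables."""
    lhs, rhs = abstraction
    result = set()
    for values in product((False, True), repeat=len(names)):
        env = dict(zip(names, values))
        if all(env[v] for v in lhs) == all(env[v] for v in rhs):
            result.add(frozenset(n for n in names if env[n]))
    return result

eq1 = pos_abstract(["x", "xs"], ["x", "ys"])        # [x|xs] = [x|ys]
eq2 = pos_abstract(["x", "xs"], ["x", "x", "ys"])   # [x|xs] = [x,x|ys]
```

Here eq1 and eq2 are identical, and the assignment in which only xs is true is
a model of the shared formula, so $xs \leftrightarrow ys$ is not a consequence
of it.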

Of course neither equation is likely to appear in real-world programs. But a 
similar phenomenon is found, less transparently, in real programs. Consider the 
program given in Figure 1 (left side), for breadth-first traversal of binary trees. 
The program uses a queue of tree nodes which is represented in the first three 
arguments of bfq/4 — the first argument represents the number of elements in the 
queue, and the second and third represent a difference list of the elements on the 
queue. This setup ensures that the program can be executed in both directions. 
A groundness analysis using Pos consists of translating the program into the 
recursive definition of a Boolean function illustrated in Figure 1 (right side) 
and then finding a closed form using Kleene iteration. This yields a description 
$w \wedge (x \leftrightarrow (y \wedge z))$ for $bfq(w, x, y, z)$ indicating
that in any answer to a query to bfq/4, the first argument is ground, and the
second is ground if and only if the third and fourth both are. This result is
precise. But consider now the analysis of the predicate bf/2. The result for
$bf(u, v)$ is obtained from the result for bfq/4 as follows:
$\exists qt : (u \wedge qt) \leftrightarrow (qt \wedge v)$, which is $true$.
Hence the Pos-based analysis of bf/2 yields no useful groundness information,
even though in fact the first argument to bf/2 is ground if and only if the
second is.

    mg(P,T,R,B) :- T = 1, B = (105/100)*P-R.
    mg(P,T,R,B) :- T > 1, T1 = T-1, P1 = (105/100)*P-R, mg(P1,T1,R,B).

Fig. 2. Mortgage repayment in CLP($\mathbb{R}_{Lin}$).
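
The Kleene iteration for bfq/4 and the resulting description of bf/2 can be
reproduced with a brute-force Python sketch that represents a Boolean function
by its set of models. This is only an illustration under that assumption (a
real analyser would use a symbolic representation), and the names step, kleene
and bf_models are invented for the example.

```python
from itertools import product

B = (False, True)

def step(f):
    """Apply the recursive definition of bfq from Figure 1 once.

    f is the current approximation: a set of models (w, x, y, z).
    """
    # First disjunct: w and (x <-> y) and z.
    out = {(w, x, y, z) for w, x, y, z in product(B, repeat=4)
           if w and x == y and z}
    # Second disjunct: eliminating n and qh leaves just bfq(w, x, y, z).
    out |= f
    # Third disjunct: enumerate the existentially quantified variables.
    for n, l, e, r, qt, qh, es in product(B, repeat=7):
        if (n, qh, qt, es) in f:
            out.add((n, l and e and r and qh, l and r and qt, e and es))
    return out

def kleene():
    """Iterate from false (no models) until the approximation is stable."""
    f = set()
    while True:
        g = step(f)
        if g == f:
            return f
        f = g

def bf_models(bfq):
    """Models of bf(u, v) = exists qt : bfq(true, u and qt, qt, v)."""
    return {(u, v) for u, v in product(B, repeat=2)
            if any((True, u and qt, qt, v) in bfq for qt in B)}
```

The fixpoint consists of the models of
$w \wedge (x \leftrightarrow (y \wedge z))$, while every assignment to $u$ and
$v$ is a model of the description of bf/2, that is, the description is $true$.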

The sign problem: The sign problem for Pos arises in the context of uniqueness 
analysis of linear arithmetic constraints [6,2]. In this context, a Boolean function 
of the form $(x \wedge y) \rightarrow z$ expresses that if the values of the
variables $x$ and $y$ become uniquely determined, then the value of the
variable $z$ will have been determined as a result. A constraint such as
$x = y - 1$ is described by the Pos function $x \leftrightarrow y$ and a
constraint such as $x = y + 3z - 1$ by
$((x \wedge y) \rightarrow z) \wedge ((y \wedge z) \rightarrow x) \wedge ((z \wedge x) \rightarrow y)$.
A problem is that the Pos abstraction of an arithmetic
constraint ignores the signs of the coefficients in the arithmetic constraints. In 
particular, it is not possible for Pos to discover linear independence amongst 
constraints determined by these signs. 

Figure 2 shows a CLP($\mathbb{R}_{Lin}$) program to calculate mortgage repayments¹.
Uniqueness analysis using Pos begins by translating the program into a
(possibly recursive) definition of a positive Boolean function, just as for
groundness analysis. The difference is in the translation of the primitive
constraints which are abstracted according to the following abstraction function:

$$\alpha(c) = \begin{cases}
true & \text{if } c \text{ is an inequality} \\
V(vars(c)) & \text{if } c \text{ is an equation}
\end{cases}$$

where $V(S) = \bigwedge_{v \in S} ((\bigwedge (S \setminus \{v\})) \rightarrow v)$.
The abstract version of Figure 2 is:

$$\begin{array}{l}
mg(p,t,r,b) = [t \wedge (b \wedge r \rightarrow p) \wedge (r \wedge p \rightarrow b) \wedge (p \wedge b \rightarrow r)] \vee [\exists t_1, p_1 : \\
\quad (t_1 \leftrightarrow t) \wedge ((p_1 \wedge p) \rightarrow r) \wedge ((p \wedge r) \rightarrow p_1) \wedge ((r \wedge p_1) \rightarrow p) \wedge mg(p_1,t_1,r,b)]
\end{array}$$
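
The abstraction of a primitive constraint can be sketched in Python by
enumerating the models of $V(S)$. The names v_models and alpha, the string
tags for the kind of constraint, and the use of None to stand for $true$ are
assumptions of this illustration only.

```python
from itertools import product

def v_models(variables):
    """Models of V(S): for each v in S, the other variables of S imply v."""
    variables = list(variables)
    result = set()
    for values in product((False, True), repeat=len(variables)):
        env = dict(zip(variables, values))
        ok = all(env[v] or not all(env[x] for x in variables if x != v)
                 for v in variables)
        if ok:
            result.add(frozenset(x for x in variables if env[x]))
    return result

def alpha(kind, variables):
    """Abstract a primitive constraint; kind is 'equation' or 'inequality'."""
    if kind == "inequality":
        return None              # stands for the formula true
    return v_models(variables)
```

For an equation over three variables the models are exactly the assignments in
which the number of true variables is not two; for two variables the result is
their equivalence, and for the single variable of T = 1 it is that variable
itself.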

A closed form for $mg(p, t, r, b)$ is $t \wedge ((p \wedge r) \rightarrow b)$. This expresses that any
successful query will bind T to a unique value. Moreover, if P and R are given 
unique values in a query, then B will be determined as a result. Again, this result 
is less precise than we might have hoped for. If one queries mg with fixed values 
for all of the parameters, except for P, then P will be determined as a result. 
This information is not expressed in the result of the Pos analysis. 

The problem here is that Pos cannot identify linear dependencies between 
constraints in a conjunction. For example, both of the systems of linear
constraints $(x - y - z = 0) \wedge (x - y = 3)$ and
$(x - y - z = 0) \wedge (x + y = 3)$ are best described by
$(x \leftrightarrow y) \wedge (x \rightarrow z)$. However, the former
determines $z$ while the latter does not. In fact, the uniqueness information
in the two systems is best described by $(x \leftrightarrow y) \wedge z$ and
$(x \leftrightarrow y) \wedge (x \leftrightarrow z)$, respectively.
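
The difference between the two systems can be checked by elementary linear
algebra. The Python sketch below is not part of any analysis in the paper: it
row-reduces the coefficient matrix with exact fractions and reports a variable
as determined when some reduced row mentions only that variable. The names
rref and determined are invented here, and the system is assumed to be
satisfiable.

```python
from fractions import Fraction

def rref(rows):
    """Reduced row echelon form of a coefficient matrix (list of lists)."""
    m = [[Fraction(v) for v in row] for row in rows]
    pivot_row = 0
    for col in range(len(m[0])):
        if pivot_row == len(m):
            break
        pr = next((r for r in range(pivot_row, len(m)) if m[r][col] != 0), None)
        if pr is None:
            continue
        m[pivot_row], m[pr] = m[pr], m[pivot_row]
        pivot = m[pivot_row][col]
        m[pivot_row] = [v / pivot for v in m[pivot_row]]
        for r in range(len(m)):
            if r != pivot_row and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[pivot_row])]
        pivot_row += 1
    return m

def determined(names, coefficients):
    """Variables fixed to a unique value by a satisfiable linear system."""
    fixed = set()
    for row in rref(coefficients):
        nonzero = [i for i, v in enumerate(row) if v != 0]
        if len(nonzero) == 1:
            fixed.add(names[nonzero[0]])
    return fixed

names = ["x", "y", "z"]
first = determined(names, [[1, -1, -1], [1, -1, 0]])    # x-y-z=0, x-y=3
second = determined(names, [[1, -1, -1], [1, 1, 0]])    # x-y-z=0, x+y=3
```

The first system fixes $z$ and the second fixes nothing, in line with the two
descriptions given above.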

¹ The interest is fixed at 5% to avoid the presence of non-linear constraints.
3 Size Relations

An interesting solution to the multiplicity problem for Pos is obtained by ap- 
plying a size relations analysis [3,23], more commonly known in the context of 
termination analysis [5,16]. The interpretation of this type of analysis is
parametric to a given size function on terms, called a symbolic norm [16].
Symbolic norms ($|\cdot|$) are similar to linear norms except that variables
are mapped to variables. In this approach, linear equalities and inequalities
express constraints on the sizes of the terms bound to variables. For example,
$z = x + y$ describes those substitutions $\sigma$ for which every instance
$\sigma\tau$ satisfies $|z\sigma\tau| = |x\sigma\tau| + |y\sigma\tau|$.

Size relation analysis has many applications, and different norms support 
different uses. Here we are interested in an application for groundness informa- 
tion, so adopt a norm which assigns any ground term to the size zero, and a 
non-ground term to the sum of the sizes of its variables. We refer to this size 
function as the var-norm. 

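Definition 1 (Var-norm). Given Herbrand term $t$, we define its var-norm by:

$$|t| = \begin{cases}
0 & \text{if } t \text{ is a constant} \\
t & \text{if } t \text{ is a variable} \\
|t_1| + \dots + |t_n| & \text{if } t \text{ has form } f(t_1, \dots, t_n)
\end{cases}$$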
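
Definition 1 translates directly into a recursive function. In the Python
sketch below the term encoding is an assumption made for the example: a
variable is a string starting with an upper-case letter, a constant is any
other string, and a compound term is a tuple (functor, arg1, ..., argn). The
symbolic sum is represented by a Counter of variable occurrences.

```python
from collections import Counter

def var_norm(term):
    """Var-norm of a term as a Counter mapping variables to multiplicities."""
    if isinstance(term, str):
        # A variable contributes itself once; a constant has size zero.
        return Counter({term: 1}) if term[:1].isupper() else Counter()
    total = Counter()
    for arg in term[1:]:          # term[0] is the functor
        total += var_norm(arg)
    return total

def cons(head, tail):
    """Helper building the list cell [head|tail]."""
    return (".", head, tail)
```

Unlike the Pos abstraction, the norm keeps multiplicities: the list
$[x|xs]$ has norm $x + xs$, whereas $[x,x|ys]$ has norm $2x + ys$.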

A size relation analysis for a logic program P can be formalised in terms of 
the semantics of a corresponding CLP($\mathbb{N}$) program on the sizes of the terms
in P. CLP($\mathbb{N}$) stands for constraint logic programs over a domain consisting
of the non-negative integers. The semantic objects are sets of CLP($\mathbb{N}$) facts
modulo an equivalence relation defined in terms of constraint equivalence. The
meaning of a CLP($\mathbb{N}$) program is a potentially infinite set of CLP($\mathbb{N}$) facts. In
practice, analyses apply a convex hull operation as an upper bound operator to 
improve efficiency, and a widening operator [11] (which detects stable edges in 
corresponding polyhedral representations of the constraints on the term sizes) 
to guarantee the termination of the analysis. 

There is a pleasing similarity between a size relations analysis and a
groundness dependency analysis. Both fit into the framework of Giacobazzi et
al. [15]. We can translate a (possibly recursive) definition of a predicate
into a (possibly recursive) definition of a constraint, expressing the term
size relations for the predicate. A term equation $t = s$ is translated into
$|t| = |s|$. For the var-norm, the bfq program in Figure 1 is translated to:

$$\begin{array}{l}
bf(u, v) = \exists qt : bfq(0, u + qt, qt, v) \\
bfq(w, x, y, z) = [w = 0 \wedge x = y \wedge z = 0] \\
\quad \vee\ [\exists n, qh : w = n \wedge x = qh \wedge bfq(n, qh, y, z)] \\
\quad \vee\ [\exists n, l, r, e, qh, qt, es : w = n \wedge x = l + r + e + qh\ \wedge \\
\qquad\quad y = l + r + qt \wedge z = e + es \wedge bfq(n, qh, qt, es)]
\end{array}$$

It is easy to verify that $w = 0 \wedge x = y + z$ is a fixed point for bfq/4.
Size relations analyses also provide groundness dependencies. Let us make
this intuition precise, without delving into the details. Consider a
substitution $\sigma$ which is described by a conjunction of primitive
CLP($\mathbb{N}$) constraints, each of the form
$\sum_{j \in J} a_j x_j \le b + \sum_{k \in K} b_k x_k$, where the
coefficients $a_j$ and $b_k$ are positive. The groundness information
contained in such a constraint can be expressed in Pos by
$\bigwedge \{x_k \mid k \in K\} \rightarrow \bigwedge \{x_j \mid j \in J\}$.
For example, if $\sigma$ satisfies the size relation $x \le y + z$ and
variables $y$ and $z$ become ground, then $x$ must be ground as a result.
Otherwise it would not be correct that any instance of $\sigma$ maps $x$ to a
term no larger than $|y|_{vn} + |z|_{vn}$.

The groundness information we obtain through size analysis for bfq/4 in
Figure 1 is identical to that obtained using Pos, namely
$w \wedge (x \leftrightarrow y \wedge z)$. However, the size analysis of bf/2
is illuminating. It shows how Size may be more precise than Pos for
groundness dependencies. The CLP($\mathbb{N}$) constraint is:
$bf(u, v) = \exists qt : bfq(0, u + qt, qt, v) = \exists qt : 0 = 0 \wedge u + qt = qt + v = (u = v)$.
That is, the groundness dependency $u \leftrightarrow v$ can be inferred. This
is more precise than what was obtained with Pos. The Size solution $u = v$ was
obtained because $qt$ had the same coefficient on both sides of the equation
$u + qt = qt + v$. With Pos we cannot distinguish this situation from
$u + qt = qt + qt + v$, for example, and so the Pos result has to cover also
the possible Size results $u \le v$ and $u \ge v$.

Formalising the abstract interpretation between Size and Pos using the
adjoint framework [9] for abstract interpretation is not obvious, since the
translation described above does not give a best approximation. Consider for
example the conjunction $(x + y + y = z) \wedge (x + y = z)$. The Pos
information obtained from these constraints is
$(x \wedge y) \leftrightarrow z$, while that obtained from the normalised
system $x = z \wedge y = 0$ is the more precise
$(x \leftrightarrow z) \wedge y$. So we need to define a normalisation
procedure in order to use the adjoint framework. Alternatively we can use a
more general framework, such as the relational framework [17].

However, Size is not the panacea for groundness analysis that it might appear 
to be. As an abstract domain it has infinite ascending chains, and hence we need 
to apply a widening operator in an analysis. Consider the program 

    listof(X,Xs) :- Xs = [X].
    listof(X,Xs) :- Xs = [X|Ys], listof(X,Ys).

An analysis using Pos finds the groundness dependency
$x \leftrightarrow xs$. An analysis using Size constructs the equation:

$$listof(x, xs) = (xs = x) \vee [\exists ys : xs = x + ys \wedge listof(x, ys)]$$

Interpreting $\vee$ as the convex hull operation, Kleene iteration yields:

$$\begin{array}{lcl}
listof^1(x, xs) & = & (xs = x) \\
listof^2(x, xs) & = & (xs = x) \vee [\exists ys : xs = x + ys \wedge ys = x] = (x \le xs \le 2x) \\
listof^3(x, xs) & = & (x \le xs \le 3x) \\
listof^4(x, xs) & = & (x \le xs \le 4x)
\end{array}$$

To avoid traversing the infinite chain, it is customary to apply a widening
operator which retains only the stable part of the constraint, and the result
$x \le xs$ is obtained as a fixed point. We have lost the groundness
information $x \rightarrow xs$.
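The iteration above can be replayed with a small Python sketch. It is not a general polyhedral analyser: as a simplifying assumption, an abstract value for listof is a pair (lo, hi) standing for lo*x <= xs <= hi*x with x >= 0, hi = None means that there is no upper bound, and the function names are hypothetical.

```python
def step(prev):
    """One Kleene step for listof; prev is None (bottom) or a pair (lo, hi)."""
    base = (1, 1)                                   # first clause: xs = x
    if prev is None:
        return base
    lo, hi = prev
    # second clause: xs = x + ys, where lo*x <= ys <= hi*x
    rec_lo = 1 + lo
    rec_hi = None if hi is None else 1 + hi
    # convex hull of the two cones: take the weaker bound on each side
    new_hi = None if rec_hi is None else max(base[1], rec_hi)
    return (min(base[0], rec_lo), new_hi)

def widen(old, new):
    """Keep only the bounds that are stable between two iterates."""
    lo = old[0] if old[0] == new[0] else 0          # sizes are non-negative
    hi = old[1] if old[1] == new[1] else None       # drop an unstable upper bound
    return (lo, hi)

def iterate(n):
    """Return the first n Kleene iterates, starting from bottom."""
    value, chain = None, []
    for _ in range(n):
        value = step(value)
        chain.append(value)
    return chain

chain = iterate(4)
print(chain)                       # the upper bound grows at every step
widened = widen(chain[1], chain[2])
print(widened, step(widened))      # widening gives a post-fixed point
```

The upper bound never stabilises, so the widened value keeps only x <= xs; the lost upper bound is exactly what carried the dependency from x to xs.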
4 A More Precise Domain

Inspired by the Size and LSign domains, we introduce a new abstract domain,
LPos, that solves the problems discussed in Sections 2 and 3. The idea is to
represent linear constraints by their signs (abstract coefficients), but
unlike LSign these abstract coefficients are introduced only by widening, and
not at abstraction time. Basically, this enables us to preserve the Pos
information which might be lost when applying the widening operation of the
Size relations domain. For example, recall the analysis of the listof/2
program in Size. The size relations analysis resulted in an infinite chain:
$x \le xs \le x$, $x \le xs \le 2x$, $x \le xs \le 3x$, ... and so a widening
was applied to get $x \le xs$. With LPos the analysis gives
$x \le xs \le \oplus x$, where $\oplus$ stands for some positive coefficient.
This ensures the termination of the analysis and preserves the groundness
dependencies otherwise lost by widening with Size.

The LPos domain: We define a primitive LPos constraint $c$ to be of the form
$\sum_{i=1}^{n} a_i x_i \; op \; b$ where
$a_i, b \in \mathbb{R} \cup \{\oplus, \ominus, \top\}$ and
$op \in \{\le, =\}$. Note that the coefficients in an LPos constraint can be
concrete (rationals) or abstract (signs). The abstract coefficients represent
sets of concrete coefficients under the obvious mapping:
$\gamma_{sign}(\top) = \mathbb{R}$,
$\gamma_{sign}(\oplus) = \{a \mid a \in \mathbb{R}, a > 0\}$,
$\gamma_{sign}(\ominus) = \{a \mid a \in \mathbb{R}, a < 0\}$ and
$\gamma_{sign}(a) = \{a\}$ otherwise. An LPos constraint is a conjunction of
primitive LPos constraints.

The elements of LPos are (equivalence classes of) sets (for disjunction) of
LPos constraints. The concretization function for LPos, $\gamma_{LPos}$, maps
the elements of LPos to sets of concrete constraints.

$$\begin{array}{rcl}
\gamma_{LPos}(\sum_{i=1}^{n} a_i x_i \; op \; b) & = & \{ \sum_{i=1}^{n} c_i x_i \; op \; d \mid c_i \in \gamma_{sign}(a_i), d \in \gamma_{sign}(b) \} \\
\gamma_{LPos}(c_1 \wedge \cdots \wedge c_m) & = & \{ d_1 \wedge \cdots \wedge d_m \mid d_i \in \gamma_{LPos}(c_i), 1 \le i \le m \} \\
\gamma_{LPos}(D) & = & \bigcup_{C \in D} \gamma_{LPos}(C)
\end{array}$$

We consider concrete constraints modulo logical (constraint) equivalence. The
concretization induces a partial order on LPos descriptions as follows:

$$D_1 \sqsubseteq_{LPos} D_2 \iff \gamma_{LPos}(D_1) \subseteq \gamma_{LPos}(D_2)$$

The join and meet operations are defined as follows:

$$D_1 \sqcup_{LPos} D_2 = D_1 \cup D_2 \quad \text{and} \quad D_1 \sqcap_{LPos} D_2 = \{ C_1 \wedge C_2 \mid C_1 \in D_1, C_2 \in D_2 \}$$

Example 1. The LPos description
$\{x \le 0,\ 2x + \oplus y \le 4 \wedge -x \le 0\}$ represents the set of
constraints $\{x \le 0\} \cup \{2x + ay \le 4 \wedge -x \le 0 \mid a > 0\}$.
Hence $2x + y \le 4 \wedge -x \le 0$ is represented by the LPos description,
and, since it is logically equivalent, so is
$2x + y \le 4 \wedge -x \le 0 \wedge y \le 5 \wedge -x \le 3$.

Abstract operations: The required operations on LPos to support groundness
and uniqueness analyses are: conjunction, disjunction and projection. The
first two are $\sqcap_{LPos}$ and $\sqcup_{LPos}$ respectively. Projection is
the only complex operation in this domain. It is defined as an extension of
Fourier elimination, enhanced to handle the abstract coefficients. Before
defining the projection algorithm, we introduce arithmetic over
$\mathbb{R} \cup \{\oplus, \ominus, \top\}$, and other operations needed for
extracting the sign of the coefficient of the variable being eliminated (by
projection).

To eliminate variables we need to know their signs. Hence we define a map- 
ping that maps coefficients to their signs: 

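Definition 2 (Sign abstraction, pos, neg).

$$\alpha_{sign}(a) = \begin{cases}
0 & \text{if } a = 0 \\
\oplus & \text{if } a > 0 \text{ or } a = \oplus \\
\ominus & \text{if } a < 0 \text{ or } a = \ominus \\
\top & \text{if } a = \top
\end{cases}$$

$$pos(\sum_{i=1}^{n} a_i x_i \; op \; b) = \{ x_i \mid \alpha_{sign}(a_i) = \oplus \}$$

$$neg(\sum_{i=1}^{n} a_i x_i \; op \; b) = \{ x_i \mid \alpha_{sign}(a_i) = \ominus \}$$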
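A minimal Python sketch of Definition 2 follows. The encoding is an assumption of the illustration, not part of the paper: the strings "+", "-" and "T" stand for the abstract coefficients, and a primitive constraint is a triple (coefficient dict, operator, right-hand side).

```python
POS, NEG, TOP = "+", "-", "T"   # hypothetical stand-ins for the abstract coefficients

def alpha_sign(a):
    """Map a concrete or abstract coefficient to its sign."""
    if a in (POS, NEG, TOP):
        return a
    if a > 0:
        return POS
    if a < 0:
        return NEG
    return 0

def pos(c):
    """Variables of the primitive constraint c whose coefficient is positive."""
    coeffs, _op, _b = c
    return {x for x, a in coeffs.items() if alpha_sign(a) == POS}

def neg(c):
    """Variables of the primitive constraint c whose coefficient is negative."""
    coeffs, _op, _b = c
    return {x for x, a in coeffs.items() if alpha_sign(a) == NEG}

# x1 + (-)x2 + 0x3 + (T)x4 <= 5
c = ({"x1": 1, "x2": NEG, "x3": 0, "x4": TOP}, "<=", 5)
print(pos(c), neg(c))
```

A variable with coefficient 0 or with the unknown sign T belongs to neither set, which is why the elimination procedure splits on T first.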



Definition 3 (Arithmetic extension). Let
$a_1, a_2 \in \mathbb{R} \cup \{\oplus, \ominus, \top\}$ and
$op \in \{\times, +\}$. The results of the arithmetic operation
$a_1 \; op \; a_2$ and of $-a_1$ (unary minus) are calculated using standard
arithmetic if $a_1$ and $a_2$ are numbers; otherwise they are given as
$\alpha_{sign}(a_1) \; op \; \alpha_{sign}(a_2)$ and $-\alpha_{sign}(a_1)$,
as specified on the right:
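The table that accompanies this definition is not reproduced in this text. The Python sketch below therefore assumes the usual rule of signs, which is a guess at the table rather than a transcription of it; the string encoding of the abstract coefficients and the function names are likewise assumptions of the illustration.

```python
POS, NEG, TOP = "+", "-", "T"   # hypothetical stand-ins for the abstract coefficients

def alpha_sign(a):
    if a in (POS, NEG, TOP):
        return a
    return POS if a > 0 else NEG if a < 0 else 0

def is_number(a):
    return not isinstance(a, str)

def minus(a):
    """Unary minus: standard on numbers, sign flip on abstract coefficients."""
    if is_number(a):
        return -a
    return {POS: NEG, NEG: POS, TOP: TOP}[a]

def times(a, b):
    if is_number(a) and is_number(b):
        return a * b                      # standard arithmetic on numbers
    sa, sb = alpha_sign(a), alpha_sign(b)
    if sa == 0 or sb == 0:
        return 0                          # assumed: zero absorbs every sign
    if TOP in (sa, sb):
        return TOP
    return POS if sa == sb else NEG       # rule of signs

def plus(a, b):
    if is_number(a) and is_number(b):
        return a + b
    sa, sb = alpha_sign(a), alpha_sign(b)
    if sa == 0:
        return sb
    if sb == 0:
        return sa
    return sa if sa == sb else TOP        # opposite or unknown signs give T

# right-hand side obtained when combining two constraints: (-) x (+) + 5 x (+)
print(plus(times(NEG, POS), times(5, POS)))
```

With these rules a sum of a positive and a negative abstract value has unknown sign, which is how a right-hand side of T can arise during elimination.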

To obtain an accurate elimination procedure we need to know the sign of the co- 
efficient of the variable we are eliminating. In case the coefficient is abstract and 
its sign is unknown (that is, T), we split the constraint to obtain a disjunction 
of constraints in which that coefficient is known. 
Definition 4 (Split). The splitting function on a coefficient $j$ is defined
by

$$split(a) = \begin{cases}
\{\oplus, \ominus, 0\} & \text{if } a = \top \\
\{a\} & \text{otherwise}
\end{cases}$$

$$split(\sum_{i=1}^{n} a_i x_i \; op \; b, j) = \{ a x_j + \sum_{i \ne j} a_i x_i \; op \; b \mid a \in split(a_j) \}$$

$$split(c_1 \wedge \cdots \wedge c_m, j) = \{ \bigwedge_{k=1}^{m} c'_k \mid c'_k \in split(c_k, j) \}$$

$$split(D, j) = \bigcup_{C \in D} split(C, j)$$
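The splitting function can be sketched as follows. As before, the encoding is an assumption of the illustration: "+", "-" and "T" stand for the abstract coefficients, a primitive constraint is a triple (coefficient dict, operator, right-hand side), and a conjunction is a list of such triples.

```python
from itertools import product

POS, NEG, TOP = "+", "-", "T"   # hypothetical stand-ins for the abstract coefficients

def split_coeff(a):
    """Replace an unknown sign by the three cases it stands for."""
    return [POS, NEG, 0] if a == TOP else [a]

def split_primitive(c, j):
    """All versions of the primitive constraint c with the coefficient of j split."""
    coeffs, op, b = c
    variants = []
    for a in split_coeff(coeffs.get(j, 0)):
        new_coeffs = dict(coeffs)
        new_coeffs[j] = a
        variants.append((new_coeffs, op, b))
    return variants

def split_conj(C, j):
    """Split every primitive constraint of the conjunction C on variable j."""
    return [list(choice) for choice in product(*(split_primitive(c, j) for c in C))]

# x1 + 2x2 <= 2  /\  x1 + (+)x2 <= (-)  /\  (+)x1 + (T)x2 <= 5  /\  x1 <= 5
C = [({"x1": 1, "x2": 2}, "<=", 2),
     ({"x1": 1, "x2": POS}, "<=", NEG),
     ({"x1": POS, "x2": TOP}, "<=", 5),
     ({"x1": 1}, "<=", 5)]
for conj in split_conj(C, "x2"):
    print(conj[2])
```

Only the constraint that carries T on the chosen variable multiplies the number of conjunctions; all others pass through unchanged.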

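Example 2. Given the LPos constraint $C$ (with the constraint involving the
coefficient $\top$ highlighted in a box):

$$(x_1 + 2x_2 \le 2) \wedge (x_1 + \oplus x_2 \le \ominus) \wedge \boxed{(\oplus x_1 + \top x_2 \le 5)} \wedge (x_1 \le 5)$$

the result (the split constraints are indicated in the boxes) of
$split(C, 2)$ is:

$$\left\{ \begin{array}{l}
(x_1 + 2x_2 \le 2) \wedge (x_1 + \oplus x_2 \le \ominus) \wedge \boxed{(\oplus x_1 + \oplus x_2 \le 5)} \wedge (x_1 \le 5), \\
(x_1 + 2x_2 \le 2) \wedge (x_1 + \oplus x_2 \le \ominus) \wedge \boxed{(\oplus x_1 + \ominus x_2 \le 5)} \wedge (x_1 \le 5), \\
(x_1 + 2x_2 \le 2) \wedge (x_1 + \oplus x_2 \le \ominus) \wedge \boxed{(\oplus x_1 + 0 x_2 \le 5)} \wedge (x_1 \le 5)
\end{array} \right\}$$

Clearly split does not change the interpretation of an abstract constraint:

Proposition 1. For all $j$: $\gamma_{LPos}(D) = \gamma_{LPos}(split(D, j))$.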
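project($D$, $V$)
    $W := vars(D) \setminus V$
    while $W \ne \emptyset$ do
        choose $x_j \in W$
        $W := W \setminus \{x_j\}$
        $D := $ eliminate($D$, $j$)
    return $D$

eliminate($D$, $j$)
    $D' := \emptyset$
    foreach $C \in D$
        foreach $C' \in$ split($C$, $j$)
            $D' := D' \cup$ fourier($C'$, $j$)
    return $D'$

combine($c_1$, $c_2$, $j$)
    let $\sum_{i=1}^{n} a^1_i x_i \; op^1 \; b^1 = c_1$
    let $\sum_{i=1}^{n} a^2_i x_i \; op^2 \; b^2 = c_2$
    if $\alpha_{sign}(a^1_j \times a^2_j) \ne \ominus$
        return $\emptyset$
    for $i := 1$ to $n$
        $a_i := a^1_i \times (-a^2_j) + a^2_i \times a^1_j$
    $a_j := 0$
    $b := b^1 \times (-a^2_j) + b^2 \times a^1_j$
    if $op^1$ = '=' and $op^2$ = '='
        $op$ := '='
    else
        $op$ := '$\le$'
    return $\{ \sum_{i=1}^{n} a_i x_i \; op \; b \}$

fourier($C$, $j$)
    $C_0 := \{ c \mid c \in C$ and $x_j \notin pos(c) \cup neg(c) \}$
    $C_\oplus := \{ c \mid c \in C$ and $x_j \in pos(c) \}$
    $C_\ominus := \{ c \mid c \in C$ and $x_j \in neg(c) \}$
    foreach $c_1 \in C_\oplus$
        foreach $c_2 \in C_\ominus$
            $C_0 := C_0 \cup$ combine($c_1$, $c_2$, $j$)
    return $C_0$

negate($c$)
    let $c = (\sum_{i=0}^{n} a_i x_i = b)$
    for $i := 1$ to $n$
        $a'_i := -a_i$
    $b' := -b$
    return $\sum_{i=0}^{n} a'_i x_i = b'$

fouriereq($C$, $j$)
    $C_0 := \{ c \mid c \in C$ and $x_j \notin pos(c) \cup neg(c) \}$
    $C_\oplus := \{ c \mid c \in C$ and $x_j \in pos(c) \}$
    $C_\ominus := \{ c \mid c \in C$ and $x_j \in neg(c) \}$
    foreach $c_1 \in (C_\oplus \cup C_\ominus)$
        let $\sum_{i=1}^{n} a_i x_i \; op \; b = c_1$
        $c_1^+ := c_1$; $c_1^- := c_1$
        if $op$ = '='
            if $\alpha_{sign}(a_j) = \oplus$ then $c_1^- :=$ negate($c_1$)
            else $c_1^+ :=$ negate($c_1$)
        foreach $c_2 \in (C_\oplus \cup C_\ominus) \setminus \{c_1\}$
            let $c_2 = \sum_{i=1}^{n} a^2_i x_i \; op^2 \; b^2$
            if $\alpha_{sign}(a^2_j) = \oplus$
                $C_0 := C_0 \cup$ combine($c_2$, $c_1^-$, $j$)
            else $C_0 := C_0 \cup$ combine($c_1^+$, $c_2$, $j$)
    return $C_0$

Fig. 3. Projection algorithm for LPos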
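To make the roles of combine and fourier concrete, here is a Python sketch of the classical Fourier elimination step restricted to numeric coefficients and to inequalities of the form "sum <= b". It deliberately omits the abstract coefficients, the split step and the equality case of the algorithm in the figure; the constraint encoding (coefficient dict, bound) is an assumption of the sketch.

```python
def combine(c1, c2, j):
    """Eliminate variable j from c1 (positive coefficient on j) and c2 (negative)."""
    a1, b1 = c1
    a2, b2 = c2
    p, q = a1.get(j, 0), a2.get(j, 0)
    if not (p > 0 and q < 0):
        return None                     # nothing to combine
    variables = sorted((set(a1) | set(a2)) - {j})
    # scale c1 by -q > 0 and c2 by p > 0 so that the j-terms cancel
    coeffs = {x: a1.get(x, 0) * (-q) + a2.get(x, 0) * p for x in variables}
    return (coeffs, b1 * (-q) + b2 * p)

def fourier(C, j):
    """Project the conjunction C (a list of constraints) onto the variables other than j."""
    zero = [c for c in C if c[0].get(j, 0) == 0]
    plus = [c for c in C if c[0].get(j, 0) > 0]
    minus = [c for c in C if c[0].get(j, 0) < 0]
    # constraints that do not mention j are kept as they are
    result = [({x: a for x, a in c[0].items() if x != j}, c[1]) for c in zero]
    for c1 in plus:
        for c2 in minus:
            result.append(combine(c1, c2, j))
    return result

# x1 + 2x2 <= 2,  3x1 - x2 <= 5,  x1 <= 5; eliminate x2
C = [({"x1": 1, "x2": 2}, 2), ({"x1": 3, "x2": -1}, 5), ({"x1": 1}, 5)]
print(fourier(C, "x2"))
```

Every pair of a positive and a negative occurrence yields one new constraint, so the output may contain redundant constraints; the conclusion of the paper notes that such redundancy does not reduce the accuracy of the resulting LPos descriptions.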



We can now define projection. project($D$, $V$) projects an LPos description
$D$, involving only inequalities, onto a set of variables $V$. Its definition
is given in Figure 3. Note that we can always represent an LPos description
using only inequalities by replacing an equality by two inequalities.
Equalities can be handled directly by replacing fourier($C$, $j$) by
fouriereq($C$, $j$).

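Example 3. Consider the LPos description $D$:

$$\left\{ \begin{array}{l}
C_1 : x_1 + 2x_2 \le 2 \wedge x_1 + \oplus x_2 \le \ominus \wedge \oplus x_1 + \top x_2 \le 5 \wedge x_1 \le 5, \\
C_2 : x_1 \le 0
\end{array} \right\}$$

The operation project($D$, $\{x_1\}$) proceeds by eliminating the variable
$x_2$ as follows. First we apply eliminate($D$, 2), which chooses $C_1$.
Splitting $C_1$ gives the three constraints described in Example 2, which we
call $C_{1,1}$, $C_{1,2}$ and $C_{1,3}$. The call to fourier($C_{1,1}$, 2)
breaks up $C_{1,1}$ according to the sign of $x_2$ into

$$C_\oplus = \{ x_1 + 2x_2 \le 2,\ x_1 + \oplus x_2 \le \ominus,\ \oplus x_1 + \oplus x_2 \le 5 \} \qquad C_0 = \{ x_1 \le 5 \}$$

and yields $\{x_1 \le 5\}$. fourier($C_{1,2}$, 2) breaks up $C_{1,2}$ by the
sign of $x_2$ into

$$C_\oplus = \{ x_1 + 2x_2 \le 2,\ x_1 + \oplus x_2 \le \ominus \} \qquad C_\ominus = \{ \oplus x_1 + \ominus x_2 \le 5 \} \qquad C_0 = \{ x_1 \le 5 \}$$

and returns
$\{x_1 \le 5 \wedge \oplus x_1 \le \oplus \wedge \oplus x_1 \le \top\}$. Now,
the call to fourier($C_{1,3}$, 2) breaks up $C_{1,3}$ by the sign of $x_2$
into

$$C_\oplus = \{ x_1 + 2x_2 \le 2,\ x_1 + \oplus x_2 \le \ominus \} \qquad C_0 = \{ x_1 \le 5,\ \oplus x_1 + 0 x_2 \le 5 \}$$

and it yields $\{x_1 \le 5 \wedge \oplus x_1 \le 5\}$. The returned values
from fourier, together with $x_1 \le 0$ (the result for $C_2$), make up the
result of eliminate:

$$\left\{ \begin{array}{l}
x_1 \le 5, \quad x_1 \le 5 \wedge \oplus x_1 \le \oplus \wedge \oplus x_1 \le \top, \\
x_1 \le 5 \wedge \oplus x_1 \le 5, \quad x_1 \le 0
\end{array} \right\}$$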



Widening: The LPos domain contains infinite ascending chains, so we need
to use widening. The idea behind the widening is to join abstract primitive
constraints in such a way that no groundness or uniqueness dependency is lost.

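Definition 5 (Sign equivalence). Given two abstract primitive LPos
constraints:

$$c_1 \equiv \sum_{i=0}^{n} a^1_i x_i \; op_1 \; b_1 \qquad c_2 \equiv \sum_{i=0}^{n} a^2_i x_i \; op_2 \; b_2$$

we say that $c_1$ and $c_2$ are sign equivalent, denoted $c_1 \sim c_2$, if
and only if: $op_1 = op_2$, $\alpha_{sign}(a^1_i) = \alpha_{sign}(a^2_i)$
($0 \le i \le n$), and $\alpha_{sign}(b_1) = \alpha_{sign}(b_2)$. Two LPos
constraints (conjunctions) $C_1$ and $C_2$ are sign equivalent if for each
$c_1 \in C_1$ there is a $c_2 \in C_2$ such that $c_1 \sim c_2$, and vice
versa.

Example 4. Consider the following two abstract LPos constraints:

$$C_1 = \underbrace{x + \oplus y \le 10}_{c_1} \wedge \underbrace{x - y \le 5}_{c_2} \quad \text{and} \quad C_2 = \underbrace{x + 4y \le 4}_{c_3} \wedge \underbrace{x + \ominus y \le 1}_{c_4}$$

$c_1, c_3$ and $c_2, c_4$ are sign equivalent, so $C_1$ and $C_2$ are sign
equivalent. If we add $c_5 = x \le \oplus$ to $C_1$, then $C_1$ and $C_2$ are
no longer sign equivalent, because $c_5$ is not sign equivalent with $c_3$ or
$c_4$.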

Proposition 2. Given a finite set of variables V, there are only finitely many 
non-sign equivalent abstract LPos constraints. 

The “widening” that we apply is based on the notion of sign equivalence. This 
operation is applied when two constraints are sign equivalent, so we do not lose 
the sign information. This is important for groundness and uniqueness analysis. 
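Definition 6 (Widening coefficients). Let
$a_1, a_2 \in \mathbb{R} \cup \{\oplus, \ominus, \top\}$. Define
$a_1 \sqcup a_2 = a_1$ if $a_1 = a_2$, otherwise
$a_1 \sqcup a_2 = \alpha_{sign}(a_1)$ if
$\alpha_{sign}(a_1) = \alpha_{sign}(a_2)$, otherwise
$a_1 \sqcup a_2 = \top$. Extend this to sign equivalent constraints:

$$\left( \sum_{i=1}^{n} a^1_i x_i \; op \; b_1 \right) \sqcup \left( \sum_{i=1}^{n} a^2_i x_i \; op \; b_2 \right) = \sum_{i=1}^{n} (a^1_i \sqcup a^2_i) x_i \; op \; (b_1 \sqcup b_2)$$

$$(c^1_1 \wedge \cdots \wedge c^1_n) \sqcup (c^2_1 \wedge \cdots \wedge c^2_m) = \bigwedge \{ c^1_i \sqcup c^2_j \mid c^1_i \sim c^2_j \}$$

Finally we obtain a widening operation, by widening sign equivalent
constraints in the descriptions.

$$\begin{array}{rcl}
widen(D_1, D_2) & = & (D_1 \setminus \{ C_1 \mid C_1 \in D_1, \exists C_2 \in D_2, C_1 \sim C_2 \}) \\
& \cup & (D_2 \setminus \{ C_2 \mid C_2 \in D_2, \exists C_1 \in D_1, C_1 \sim C_2 \}) \\
& \cup & \{ C_1 \sqcup C_2 \mid C_1 \in D_1, C_2 \in D_2, C_1 \sim C_2 \}
\end{array}$$

Example 5. Let LPos constraints $C_1$ and $C_2$ be as in Example 4. Then
$widen(\{C_1\}, \{C_2\}) = \{ x + \oplus y \le \oplus \wedge x + \ominus y \le \oplus \}$.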
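The coefficient widening of Definition 6 and the sign equivalence test of Definition 5 can be sketched in Python and checked against Example 5. The encoding is an assumption of the illustration: "+", "-" and "T" stand for the abstract coefficients, and a primitive constraint is a triple (coefficient dict, operator, right-hand side).

```python
POS, NEG, TOP = "+", "-", "T"   # hypothetical stand-ins for the abstract coefficients

def alpha_sign(a):
    if a in (POS, NEG, TOP):
        return a
    return POS if a > 0 else NEG if a < 0 else 0

def widen_coeff(a1, a2):
    """Keep equal coefficients, otherwise keep a shared sign, otherwise give up."""
    if a1 == a2:
        return a1
    s1, s2 = alpha_sign(a1), alpha_sign(a2)
    return s1 if s1 == s2 else TOP

def sign_equiv(c1, c2):
    """Sign equivalence of two primitive constraints."""
    (a1, op1, b1), (a2, op2, b2) = c1, c2
    variables = set(a1) | set(a2)
    return (op1 == op2
            and alpha_sign(b1) == alpha_sign(b2)
            and all(alpha_sign(a1.get(x, 0)) == alpha_sign(a2.get(x, 0))
                    for x in variables))

def widen_primitive(c1, c2):
    """Widen two sign equivalent primitive constraints coefficient by coefficient."""
    assert sign_equiv(c1, c2)
    (a1, op, b1), (a2, _op, b2) = c1, c2
    variables = sorted(set(a1) | set(a2))
    coeffs = {x: widen_coeff(a1.get(x, 0), a2.get(x, 0)) for x in variables}
    return (coeffs, op, widen_coeff(b1, b2))

# The primitive constraints c1..c4 of Example 4
c1 = ({"x": 1, "y": POS}, "<=", 10)
c2 = ({"x": 1, "y": -1}, "<=", 5)
c3 = ({"x": 1, "y": 4}, "<=", 4)
c4 = ({"x": 1, "y": NEG}, "<=", 1)
print(widen_primitive(c1, c3))
print(widen_primitive(c2, c4))
```

Because widening is applied only to sign equivalent constraints, the last case of the coefficient rule (T) is never reached here, so no sign is lost.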



5 LPos instead of Pos 



In this section we describe the use of LPos for groundness and uniqueness anal- 
ysis, and show that it is uniformly more accurate than the corresponding Pos 
analyses. Before going into details we define a splitting function on all variables, 
which is based on the split given in Definition 4 . We use this to map the LPos 
elements to their Pos description, in both groundness and uniqueness analysis. 
The split function effectively replaces all $\top$ coefficients by the three
choices they represent, $\{\oplus, \ominus, 0\}$.

Definition 7 (Splitting on all variables). Given an LPos conjunction $C$, we
define the splitting of $C$ on all its variables as follows:

$$splitall(c_1 \wedge \cdots \wedge c_m) = \{ c'_1 \wedge \cdots \wedge c'_m \mid c'_i \in splitall(c_i) \}$$

$$splitall(\sum_{i=1}^{n} a_i x_i \; op \; b) = splitall^*(\sum_{i=1}^{n} a_i x_i \; op \; b, n)$$

$$splitall^*(c, n) = \begin{cases}
\{c\} & \text{if } n = 0 \\
\bigcup_{c' \in split(c, n)} splitall^*(c', n - 1) & \text{otherwise}
\end{cases}$$



Example 6. Consider the LPos constraint (Bx + Ty < ©ATx + ©y = ©. Splitting 
on all variables, that is, on x and y, gives a set of nine abstract constraints: 
splitall(C) = 



' ©X + Oy < © A Ox + ©y = ©, 
©X + ©y < © A Ox + ©y = ©, 

< ©X + ©y < © A ©X + ©y = ©, 
©X + Oy < © A ©X + ©y = ©, 
^ ©X + ©y < © A ©X + ©y = © 



©X + ©y < © A Ox + ©y = © 
©X + Oy < © A ©X + ©y = © 
©X + ©y < © A ©X + ©y = © > 
©X + ©y < © A ©X + ©y = © 



By Proposition 1 we can eliminate $\top$ coefficients using splitall. From
here on we therefore assume that no LPos description involves $\top$.
Groundness analysis of logic programs: Given a logic program $P$, the
meaning of $P$ over the LPos domain is obtained in two stages: the first is
abstracting the Herbrand terms to abstract LPos constraints using the
var-norm function, and the second is computing the meaning of the program.
For simplicity, we will use a Kleene iteration combined with widening, in
order to compute an approximation of the least fixed point of the equation
generated by the abstraction function.

The abstract LPos version $P'$ of a normalised logic program $P$ is obtained
as follows: For each clause $C \equiv p(\bar{x}) \leftarrow b_1, \ldots, b_n$
we generate an equation:

$$p(\bar{x}) = \exists \bar{x} : b'_1 \wedge \ldots \wedge b'_n \wedge \bigwedge \{ (v \ge 0) \mid v \in vars(C) \}$$

where each $b'_i$ is a translation of $b_i$ according to this rule:
$b'_i = \text{var-norm}(s) \le \text{var-norm}(t) \wedge \text{var-norm}(t) \le \text{var-norm}(s)$
if $b_i = (s = t)$, otherwise $b'_i = b_i$. The operations $\wedge$ and
$\exists$ are $\sqcap_{LPos}$ and project respectively. The join operation
will be denoted by $\vee$. The Kleene iteration is performed in the standard
way, but after each step (or finitely many steps) we apply a widening.

Similar to Size, LPos objects describe groundness and can be translated to 
Pos descriptions according to the description function defined below. 

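Definition 8 (Description function $\alpha_g$). We can extract groundness
information from an LPos object as follows:

$$\alpha_g(c) = \begin{cases}
\bigvee_{c' \in splitall(c)} (\bigwedge neg(c') \rightarrow \bigwedge pos(c')) & \text{if } c \text{ is an inequality} \\
\bigvee_{c' \in splitall(c)} (\bigwedge neg(c') \leftrightarrow \bigwedge pos(c')) & \text{if } c \text{ is an equation}
\end{cases}$$

$$\alpha_g(C) = \bigwedge_{i=1}^{n} \alpha_g(c_i), \text{ where } C = c_1 \wedge \cdots \wedge c_n$$

$$\alpha_g(D) = \bigvee_{C \in D} \alpha_g(C)$$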
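For a primitive constraint that contains no unknown sign (so that splitall returns the constraint itself), the description function reduces to a single implication or equivalence. The Python sketch below covers only that case; the encoding of constraints and the representation of a Pos formula as a triple (antecedent variables, connective, consequent variables) are assumptions of the illustration.

```python
POS, NEG = "+", "-"   # hypothetical stand-ins for the abstract coefficients

def sign(a):
    if a in (POS, NEG):
        return a
    return POS if a > 0 else NEG if a < 0 else 0

def alpha_g(c):
    """Groundness description of one primitive constraint without T coefficients."""
    coeffs, op, _b = c
    negs = frozenset(x for x, a in coeffs.items() if sign(a) == NEG)
    poss = frozenset(x for x, a in coeffs.items() if sign(a) == POS)
    return (negs, "<->" if op == "=" else "->", poss)

def holds(formula, ground):
    """Evaluate the formula when exactly the variables in `ground` are ground."""
    negs, connective, poss = formula
    lhs = negs <= ground          # conjunction of the negative variables
    rhs = poss <= ground          # conjunction of the positive variables
    return (lhs == rhs) if connective == "<->" else (not lhs or rhs)

# xs + (-)x = 0, that is xs = (+)x: groundness of x and xs coincide
eq = alpha_g(({"xs": 1, "x": NEG}, "=", 0))
# x - xs <= 0, that is x <= xs: only "xs ground implies x ground" survives
le = alpha_g(({"x": 1, "xs": -1}, "<=", 0))
print(eq, holds(eq, {"x"}))
print(le, holds(le, {"x"}))
```

The second constraint is the result of widening in Size for listof, and the check shows that it admits x ground with xs non-ground, whereas the equation with an abstract positive coefficient does not.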

Example 7. Let us (re)analyse the listof program, this time using LPos, and
show how LPos gives better information than Size. First we use the var-norm
function to abstract the program to the following recursive equations:

$$\begin{array}{l}
listof(x, xs) = [-x \le 0 \wedge -xs \le 0 \wedge xs = x] \\
\quad \vee\ [\exists ys : -x \le 0 \wedge -xs \le 0 \wedge -ys \le 0 \wedge xs = x + ys \wedge listof(x, ys)]
\end{array}$$

and two steps of Kleene iteration yield:

$$\begin{array}{l}
listof^1(x, xs) = [-x \le 0 \wedge -xs \le 0 \wedge xs = x] \\
listof^2(x, xs) = [-x \le 0 \wedge -xs \le 0 \wedge xs = x] \\
\quad \vee\ [\exists ys : -x \le 0 \wedge -xs \le 0 \wedge -ys \le 0 \wedge xs = x + ys \wedge ys = x] \\
\quad = [-x \le 0 \wedge -xs \le 0 \wedge xs = x] \vee [-x \le 0 \wedge -xs \le 0 \wedge xs = x + x]
\end{array}$$

Applying a widening on $listof^1(x, xs)$ and $listof^2(x, xs)$ gives

$$C = -x \le 0 \wedge -xs \le 0 \wedge xs = \oplus x$$

which is a fixed point, and its Pos description is
$\alpha_g(C) = xs \leftrightarrow x$.
Lemma 1. Let $D_1$, $D_2$ be elements of LPos and $V$ a set of variables.
Then:
(a) $\alpha_g(D_1 \sqcup_{LPos} D_2) \models \alpha_g(D_1) \vee \alpha_g(D_2)$
and
$\alpha_g(D_1 \sqcap_{LPos} D_2) \models \alpha_g(D_1) \wedge \alpha_g(D_2)$;
(b) $\alpha_g(project(D_1, V)) \models \exists V : \alpha_g(D_1)$; and
(c) $\alpha_g(D_1 \sqcup_{LPos} D_2) = \alpha_g(widen(D_1, D_2))$.

Theorem 1. (a) $\alpha_g$ correctly (and accurately) extracts groundness
information from LPos descriptions; and (b) LPos is uniformly at least as
accurate as Pos for groundness analysis of logic programs.

Uniqueness analysis: With LPos, the analysis of a constraint program has two
stages, just as groundness analysis has. Only the abstraction function
differs. Given a normalised linear arithmetic constraint program $P$, an
abstract LPos version $P'$ is obtained as follows: For each clause
$p(\bar{x}) \leftarrow b_1, \ldots, b_n$ generate the equation
$p(\bar{x}) = \exists \bar{x} : b'_1 \wedge \ldots \wedge b'_n$, where $b'_i$
is a translation of $b_i$: if $b_i$ is an equation or a call, $b'_i = b_i$,
otherwise $b'_i = true$. The operations $\wedge$, $\vee$ and $\exists$ are
the same as in groundness analysis, and Kleene iteration is performed
similarly. It is clear that the LPos object obtained in this analysis is an
approximation of the concrete object of $P$, so a uniqueness analysis is
obtained by using the description function below.

Definition 9 (Description function $\alpha_u$). We can extract uniqueness
information from an LPos object as follows:

{ true if C is an inequality 

V V{vars{c)) if C is an equation 
cGsplitall(C) 

$$\alpha_u(C) = \bigwedge_{i=1}^{n} \alpha_u(c_i), \text{ where } C = c_1 \wedge \cdots \wedge c_n$$

$$\alpha_u(D) = \bigvee_{C \in D} \alpha_u(C)$$

Example 8. Using LPos, the abstract version of Figure 2 is

$$\begin{array}{l}
mg(p, t, r, b) = [(t = 1) \wedge (b = cp - r)] \\
\quad \vee\ [\exists t_1, p_1 : (t_1 = t - 1) \wedge (p_1 = cp - r) \wedge mg(p_1, t_1, r, b)]
\end{array}$$

Kleene iteration proceeds as follows:

$$\begin{array}{l}
mg^1(p, t, r, b) = [(t = 1) \wedge (b = cp - r)] \\
mg^2(p, t, r, b) = [(t = 1) \wedge (b = cp - r)] \\
\quad \vee\ [\exists t_1, p_1 : (t_1 = t - 1) \wedge (p_1 = cp - r) \wedge (t_1 = 1) \wedge (b = cp_1 - r)] \\
\quad = [(t = 1) \wedge (b = cp - r)] \vee [(t = 2) \wedge (b + (1 + c)r - c^2 p = 0)]
\end{array}$$

and a widening on $mg^1(p, t, r, b)$ and $mg^2(p, t, r, b)$ yields:

$$mg(p, t, r, b) = (t = \oplus) \wedge (b + \oplus r + \ominus p = 0)$$

which is a fixed point. Its description in Pos is

$$t \wedge ((p \wedge r) \rightarrow b) \wedge ((b \wedge r) \rightarrow p) \wedge ((b \wedge p) \rightarrow r)$$

which is more precise than the one we obtained using Pos, namely
$t \wedge ((p \wedge r) \rightarrow b)$.
Lemma 2. Let $D_1$ and $D_2$ be elements of LPos and $V$ a set of variables.
Then:
(a) $\alpha_u(D_1 \sqcup_{LPos} D_2) \models \alpha_u(D_1) \vee \alpha_u(D_2)$
and
$\alpha_u(D_1 \sqcap_{LPos} D_2) \models \alpha_u(D_1) \wedge \alpha_u(D_2)$;
(b) $\alpha_u(project(D_1, V)) \models \exists V : \alpha_u(D_1)$; and
(c) $\alpha_u(D_1 \sqcup_{LPos} D_2) = \alpha_u(widen(D_1, D_2))$.

Theorem 2. (a) $\alpha_u$ correctly (and accurately) extracts uniqueness
information from LPos descriptions; and (b) LPos is uniformly at least as
accurate as Pos for uniqueness analysis of CLP($\mathbb{R}_{Lin}$) programs.

6 Conclusion 

We have shown how the well-known Pos domain occasionally is less than ade- 
quate for groundness and uniqueness analysis. For example, a Pos-based analysis 
of the celebrated “mortgage” program exhibits a significant loss of precision. 

A partial solution can be obtained by using instead a domain Size of term size 
constraints. However, we have shown that this solution is not entirely satisfac- 
tory. In fact even the reduced product [10] of Pos and Size has shortcomings for 
uniqueness analysis. For this reason, inspired by the LSign domain, we proposed 
LPos as an improved Pos for groundness and uniqueness analysis. 

A different attempt to improve the precision of Pos-based analysis was made 
by File and Ranzato [13] who considered the use of the powerset of Pos, P(Pos). 
However, while this increases the expressiveness of the abstract domain, it can 
be argued that groundness analysis cannot capitalise on the improved precision, 
since the Pos content of the result of a P(Pos) analysis is exactly that of a 
Pos analysis [14]. (This is interesting because it differs from the situation
where a smaller domain is chosen as starting point; for example, the result
of a Def analysis [1] may be less precise than the Def content of a P(Def)
analysis.)

Marriott and Stuckey originally introduced the LSign domain [20] to derive in- 
formation about the satisfiability of linear arithmetic constraints. LSign could be 
used for groundness and uniqueness analysis also, but it is less precise than what 
we can obtain in LPos. LPos extends the original LSign in the same direction as 
the Lint interval improvement of LSign [21], by using both concrete coefficients 
(size 0 intervals) and abstract coefficients (semi-infinite intervals) and sets of 
abstract constraints. Unlike LSign and Lint, LPos does not use abstract Gauss- 
Jordan variable elimination. Interestingly, using this elimination algorithm, the 
accuracy theorems fail to hold. The Fourier elimination creates redundant con- 
straints, and these are anathema to the LSign goal of detecting satisfiability, but 
they do not reduce the accuracy of the resulting LPos descriptions. 
Abstract. Justifying the truth value of a goal resulting from query
evaluation of a logic program corresponds to providing evidence, in terms of
a proof, for this truth. In an earlier work we introduced the notion of
justification [8] and gave an algorithm for justifying tabled logic programs
by post-processing the memo tables created during evaluation. A conservative
justifier such as the one described in that work proceeds in two separate
stages: it evaluates the truth of literals (that can possibly contribute to
the evidence) in the first stage and constructs the justification in the next
stage. Justifications built in this fashion seldom fail. Whereas for tabled
predicates evaluation amounts to a simple table look-up during justification,
for non-tabled predicates it amounts to Prolog-style re-execution. In a
conservative justifier a non-tabled literal can be re-executed repeatedly,
causing unacceptable performance overheads for programs with significant
non-tabled components: justification time for a single non-tabled literal can
become quadratic in its evaluation time!

In this paper we introduce the concept of a speculative justifier. In such
a justifier we evaluate the truths of literals in tandem with justification.
Specifically, we select literals that can possibly provide evidence for the
goal's truth, assume that their truth values correspond to the goal's and
proceed to build a justification for each of them. Since these truths are
not computed beforehand, justifications produced in this fashion may fail
often. On the other hand, non-tabled literals are re-executed less often than
in conservative justifiers. We discuss the subtle efficiency issues that
arise in the construction of speculative justifiers. We show how to
judiciously balance the different efficiency concerns and engineer a
speculative justifier that addresses the performance problem associated with
conservative justifiers. We provide experimental evidence of its efficiency
and scalability in justifying the results of our XMC model checker.



1 Introduction 

Query evaluation of a goal with respect to a logic program establishes the
truth or falsehood of the goal. However, the underlying evaluation engine
typically provides little or no information as to why the conclusion was
reached. This problem broadly falls under the purview of debugging. Usually
logic programs are debugged using trace-based debuggers (e.g. Prolog's
four-port debugger) that operate by tracing through the entire proof search.
Such traces are aided through several navigation mechanisms (e.g. setting
breakpoints or spy points, skips, leaps, etc.) provided by the debugger.

* Research partially supported by NSF awards EIA-9705998, CCR-9876242, IIS-

There are several reasons why trace-based debuggers are cumbersome to
use. Firstly, they give the entire search sequence including all the failure
paths, which is essentially irrelevant if the user is only interested in
comprehending the essential aspects of how the answer was derived. Secondly,
the proof search strategy of Prolog, with its forward and backward
evaluation, already makes tracing a Prolog execution considerably harder than
tracing through procedural programs. The problem is considerably exacerbated
for tabled logic programs, since the complex scheduling and fixed-point
computing strategies of tabled resolution make it very difficult to
comprehend the sequence produced by a tracer. Finally, from our own
experience with the XMC model checker [1] (which is an application of the XSB
tabled logic programming system [11]), trace-based debuggers provide no
support for translating the results of the trace (which is at the logic
program evaluation level) to the problem space (e.g. CCS expressions and
modal mu-calculus formulas in XMC).

In [8] we proposed the concept of a justifier for giving evidence, in terms
of a proof, for the truth value of the result generated by query evaluation
of a logic program. The essence of justification is to succinctly convey to
the user only those parts of the proof search which are relevant to the
proof/disproof of the goal. For example, if a query is evaluated to true, the
justifier will present the details of a successful computation path,
completely ignoring any unsuccessful paths traversed. Similarly, when a query
is evaluated to false, it will only show a false literal in each of its
computation paths, completely ignoring the true literals. Figure 1 is an
illustration of justification, where the predicate reach/2 (Figure 1a) is
tabled. Evaluation of the query reach(a,d) generates a forest of search trees
(Figure 1b). (See [12] for an overview of tabled evaluation.)

Although justification is a general concept, the focus of our earlier work in 
[8] was on justifying tabled logic programs. Towards that end we presented an 
algorithm for justifying such programs by post-processing the memo tables cre- 
ated during query evaluation. To justify the answer to a query some “footprints” 
need to be stored during query evaluation. The justifier uses these footprints to 
extract evidence supporting the result. The naturalness of using a tabled LP sys- 
tem for justification is that the answer tables created during query evaluation 
serve as the footprints. Indeed during query evaluation the internally created 
tables implicitly represent the lemmas that are proved during evaluation. By 
using these lemmas stored in the tables, the justifier presents only relevant parts 
of the derivation to the user. In other words the additional information needed 
for doing justification comes for "free". Thus justification using a tabled
logic programming system is "non-intrusive" in the sense that it is
completely decoupled from the query evaluation process and is done only after
the latter is completed.
More importantly, justification is done without compromising the performance 
of query evaluation. 
(a)
    :- table reach/2.
    reach(A,B) :- arc(A,B).
    reach(A,B) :- arc(A,D), reach(D,B).
    arc(a,b).
    arc(a,c).
    arc(b,a).
    arc(c,d).

(b) [Forest of search trees for the calls reach(a,d), reach(b,d), reach(c,d)
    and reach(d,d); the tree layout is not recoverable from the source.]

(c)
    reach(a,d)
        arc(a,c)        fact
        reach(c,d)
            arc(c,d)

Fig. 1. Justifying reach(a,d): (a) Logic Program (b) Forest of Search Trees
(c) Justification
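The idea of reading a justification off the answer tables can be illustrated with a small Python sketch for the program of Figure 1. It is not the algorithm of [8]: the table is computed by a naive fixed point rather than by tabled resolution, only true literals are justified, and all function names are hypothetical.

```python
ARCS = {("a", "b"), ("a", "c"), ("b", "a"), ("c", "d")}

def reach_table(arcs):
    """Least fixed point of reach/2; plays the role of the answer table."""
    table = set(arcs)
    changed = True
    while changed:
        changed = False
        for (x, d) in arcs:
            for (d2, y) in list(table):
                if d == d2 and (x, y) not in table:
                    table.add((x, y))
                    changed = True
    return table

def justify(x, y, arcs, table, seen=frozenset()):
    """Return a proof tree for reach(x, y) built from table look-ups, or None."""
    if (x, y) not in table or (x, y) in seen:
        return None                       # false, or a cyclic (circular) argument
    if (x, y) in arcs:
        return ("reach", x, y, [("arc", x, y, "fact")])
    seen = seen | {(x, y)}
    for (a, d) in sorted(arcs):
        # only follow a clause instance whose body is true according to the table
        if a == x and (d, y) in table:
            sub = justify(d, y, arcs, table, seen)
            if sub is not None:
                return ("reach", x, y, [("arc", x, d, "fact"), sub])
    return None

table = reach_table(ARCS)
print(justify("a", "d", ARCS, table))
```

The path through b is tried first and abandoned because it can only lead back to reach(a,d); the proof returned goes through arc(a,c) and reach(c,d), as in part (c) of the figure.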



Justifying the truth value of a given literal, which we will denote as the
goal, amounts to providing a proof that usually will involve searching for
other literals relevant to the proof, knowing their truth values, justifying
each such truth value and putting them all together to produce a
justification of the goal's truth. For some of them we may fail to produce
justifications relevant for justifying the goal. In Example 1 below the
clause p :- r is irrelevant for justifying that p is true, since the failure
of r is not the correct evidence for p's truth. Had we selected this clause
and proceeded to build a justification for r, we would have eventually
discovered that it is irrelevant. Thus avoiding irrelevant justifications is
an important parameter in the design of justification algorithms.

Example 1. Consider the following logic program:

    p :- r.    p :- t.
    r :- ..., fail.
    t.

The justification algorithm in [8] yields a conservative justifier in the
sense that by design it is geared towards limiting such wasteful
justifications. It does so by evaluating the truth of literals (that can
possibly provide supporting evidence for the goal's truth) in the first
stage. Armed with the needed truths, in a separate second stage it proceeds
to construct their justifications. By evaluating the truth of r beforehand
upon selecting the clause p :- r in Example 1, we can avoid building the
justification of r to support the truth of p, only to fail eventually.

The algorithm in [8] implicitly assumed that all the predicates in the
program are tabled. But real-life logic programs consist of both tabled and
non-tabled predicates. How does it handle such programs? Whereas for tabled
predicates evaluation is a simple table look-up during justification, for
non-tabled predicates it amounts to Prolog-style re-execution. In a
conservative justifier, justification of a non-tabled literal can trigger
repeated evaluations of other non-tabled literals on its proof path, causing
unacceptable performance overheads for programs with significant non-tabled
components. Specifically, the time taken to justify the truth of a single
non-tabled literal can become quadratic in its evaluation time! In fact, on
large model checking problems our XMC model checker took a few minutes to
produce the results, whereas the justifier failed to produce a justification
even after several hours!

In this paper we explore the concept of a speculative justifier to address the
above performance problem associated with a conservative justifier. The idea
underlying such a justifier is this: when we select a literal as a possible candidate
for inclusion in the justification of the goal’s truth, we speculate that it will be
relevant and proceed to build its justification. Since we do not know its truth
value beforehand, we may eventually discover that we are unable to produce a
justification for it that is relevant for justifying the goal’s truth (such as the
justification of r in Example 1). On the other hand, if we never encounter any such
literal, then for a non-tabled literal we have built its justification without having
to repeatedly traverse its proof path. But doing speculative justification naively
can result in failing more often and thus offset any gains accrued by avoiding
repeated re-executions of non-tabled literals. In this paper we discuss these subtle
efficiency issues that arise in the design and implementation of speculative
justifiers. We show how to judiciously balance the different efficiency concerns and
engineer a speculative justifier that addresses the performance problem associated
with conservative justifiers.

The rest of the paper is organized as follows. In Section 2 we review the concept
of justification. Section 3 reviews the conservative justifier. In Section 4 we
present the design of a speculative justifier. In Section 5 we discuss its
implementation and practical impact on real-world applications drawn from model
checking. Discussion appears in Section 6. The technical machinery developed in
this paper assumes definite clause logic programs. Extensions are also discussed
in Section 6.

Related Work. A number of proposals to explain the results of query evaluation
of logic programs have been put forth in the past. These include algorithmic
debugging techniques [10], declarative debugging techniques [4,6], assertion-based
debugging techniques [7], and explanation techniques [5]. A more detailed
comparison between justification and these approaches appears in our earlier work
[8]. Suffice it to say here that although justification is similar in spirit to the above
approaches in terms of its objectives, it differs considerably from all of them. It
is done as a post-processing step after query evaluation, and not along with
the query evaluation (as in algorithmic and assertion-based debugging) or
before query evaluation (as in declarative and assertion-based debugging). Unlike
declarative debugging, justification does not demand any creative input from the
user regarding the intended model of the program, which can be very hard or
even impossible to provide, as is the case in model checking. Beyond all
that, this paper examines efficiency issues that arise in justifying logic programs
consisting of both tabled and non-tabled predicates, a topic that has not been
explored in the literature.

2 Justification

In this section we will recall the formalisms developed in [8] for justification. 
We generalize them here in order to deal with mixed programs containing both 
tabled and non-tabled predicates. 

Notational Conventions We use P to denote logic programs; $HB(P)$ and $M(P)$ to
denote the Herbrand base and least Herbrand model of P, respectively; A and B to
denote atoms or literals; $\alpha$ to denote a set of atoms or literals; $\beta$ to denote a
conjunction of atoms or literals (a goal is a conjunction of atoms); $\theta$ to denote
substitutions; $\succeq$ to denote atom subsumption ($A \succeq B$ for A subsumes B);
and C to denote a clause in a program. For a binary relation R, we denote its
(reflexive) transitive closure by $R^*$.

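Definition 1 (Truth Assignment) The truth assignment of an atom A with
respect to program P, denoted by $\tau_P(A)$, is:

$$\tau_P(A) = \begin{cases} \mathit{true} & \text{if } A \in M(P) \\ \mathit{false} & \text{if } A \notin M(P) \end{cases}$$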



We drop the parameter P and write the truth assignment as $\tau(A)$ whenever
the program is obvious from the context. Let A be an answer to some query
in program P, i.e., $\tau(A) = \mathit{true}$. We can complete one step in explaining this
answer by finding a clause C such that (i) A unifies with the head of C, and (ii)
each literal B in the body of C has $\tau(B) = \mathit{true}$. If A is not an answer to any
query, i.e., $\tau(A) = \mathit{false}$, we can explain this failure by showing that for each
clause C whose head unifies with A, there is at least one literal B in the body of C
such that $\tau(B) = \mathit{false}$. We call such one-step explanations locally consistent
explanations (lce’s), defined formally as follows:

Definition 2 (Locally consistent explanation (lce)) A locally consistent
explanation for an atom A with respect to program P, denoted by $\xi_P(A)$, is a
collection of sets of atoms such that:

1. If $\tau(A) = \mathit{true}$:
   $\xi_P(A) = \{\alpha_1, \alpha_2, \ldots, \alpha_m\}$, with each $\alpha_i$ being a set of atoms
   $\{B_1, B_2, \ldots, B_n\}$ such that:
   (a) $\forall\, 1 \le j \le n\ \ \tau(B_j) = \mathit{true}$, and
   (b) $\exists\, C = A' \mathrel{:-} \beta$ and a substitution $\theta$ such that $A'\theta = A$ and
       $\beta\theta = (B_1, B_2, \ldots, B_n)\theta$.

2. If $\tau(A) = \mathit{false}$:
   $\xi_P(A) = \{L\}$, a singleton collection where $L = \{B_1, B_2, \ldots, B_n\}$ is the
   smallest set such that:
   (a) $\forall\, 1 \le j \le n\ \ \tau(B_j) = \mathit{false}$, and
   (b) $\forall$ substitutions $\theta$ and $C = A' \mathrel{:-} (B'_1, B'_2, \ldots, B'_l)$, $A'\theta = A\theta \implies
       \exists\, 1 \le k \le l$ such that $B'_k\theta \in L$ and $\forall\, 1 \le i < k\ \ \tau(B'_i\theta) = \mathit{true}$.

$\xi(\mathit{reach}(a,d)) = \{\{\mathit{arc}(a,c),\ \mathit{reach}(c,d)\}\}$
$\xi(\mathit{reach}(c,d)) = \{\{\mathit{arc}(c,d)\}\}$
$\xi(\mathit{arc}(c,d)) = \{\{\}\}$
$\xi(\mathit{reach}(a,c)) = \{\{\mathit{arc}(a,c)\},\ \{\mathit{arc}(a,b),\ \mathit{reach}(b,c)\}\}$

(a) lce’s for true literals

$\xi(\mathit{reach}(a,e)) = \{\{\mathit{arc}(a,e),\ \mathit{reach}(b,e),\ \mathit{reach}(c,e)\}\}$
$\xi(\mathit{reach}(b,e)) = \{\{\mathit{arc}(b,e),\ \mathit{reach}(a,e)\}\}$
$\xi(\mathit{arc}(a,e)) = \{\{\}\}$

(b) lce’s for false literals

Fig. 2. A fragment of lce’s for the example in Figure 1
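
Definitions 1 and 2 can be made concrete for a ground (propositional) program. The Python sketch below is an illustration only and is not the implementation of [8]: the dictionary encoding of clauses and the helper names `least_model`, `tau`, and `lce` are assumptions made for this example. It uses the program of Example 1, with the elided body of r replaced by a hypothetical atom s that has no clauses.

```python
# Ground program: head -> list of clause bodies (each body is a tuple of atoms).
# A fact is a clause with an empty body. The atom s is hypothetical: it has no
# clauses, so it is false and makes r fail.
PROGRAM = {
    "p": [("r",), ("t",)],
    "r": [("s",)],
    "t": [()],
}


def least_model(program):
    """Least Herbrand model M(P), computed by naive fixpoint iteration."""
    model = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in program.items():
            if head not in model and any(
                all(b in model for b in body) for body in bodies
            ):
                model.add(head)
                changed = True
    return model


def tau(atom, model):
    """Truth assignment of Definition 1."""
    return atom in model


def lce(atom, program, model):
    """Locally consistent explanations of Definition 2, as a set of frozensets."""
    bodies = program.get(atom, [])
    if tau(atom, model):
        # One explanation per clause whose body literals are all true.
        return {
            frozenset(body)
            for body in bodies
            if all(tau(b, model) for b in body)
        }
    # False atom: a single explanation holding the first false literal of
    # every clause (all literals before it in that clause are true).
    first_false = {
        next(b for b in body if not tau(b, model)) for body in bodies
    }
    return {frozenset(first_false)}


MODEL = least_model(PROGRAM)
```

In this encoding a true fact and a false atom without clauses both have the explanation collection {{}}, which mirrors the entries for arc(c,d) and arc(a,e) in Figure 2.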



We write $\xi_P(A)$ as $\xi(A)$ whenever the program P is clear from the context.
Observe that, for an atom A, the different sets in the collection $\xi(A)$ represent
different consistent explanations for the truth or falsehood of A. An answer A
can be explained in terms of the answers $\{B_1, B_2, \ldots, B_k\}$ in $\xi(A)$ and then by
(recursively) explaining each $B_i$. For example, $\xi(\mathit{reach}(a,d))$ in Figure 2 has a
set with elements arc(a,c) and reach(c,d), meaning that the truth value (true) of
reach(a,d) can be explained using the explanations of arc(a,c) and reach(c,d).
Such explanations can be captured by a graph as shown in Figure 1(c). The edges
denote locally consistent explanations. We do not use cyclic explanations to justify
a true literal. In contrast, cyclic explanations describe infinite proof paths and can
be used to justify a false literal. Instead of explicitly representing these cycles,
however, we choose to keep the justification as an acyclic graph, breaking each
cycle by redirecting at least one edge to a special node marked as ancestor.
Formally:

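Definition 3 (Justification) A justification for an atom A with respect to
program P, denoted by $J(A,P)$, is a directed acyclic graph $G = (V,E)$ with vertex
labels chosen from $HB(P) \cup \{\mathit{fact}, \mathit{fail}, \mathit{ancestor}\}$ such that:

1. G is rooted at A, and is connected
2. $(B_1, \mathit{fact}) \in E \iff \{\} \in \xi(B_1) \wedge \tau(B_1) = \mathit{true}$
3. $(B_1, \mathit{fail}) \in E \iff \{\} \in \xi(B_1) \wedge \tau(B_1) = \mathit{false}$
4. $(B_1, \mathit{ancestor}) \in E \iff \tau(B_1) = \mathit{false} \wedge \xi(B_1) = \{L\}
   \wedge \exists\, B_2 \in L$ s.t. $(B_2, B_1) \in E^* \vee B_2 = B_1$
5. $(B_1, B_2) \in E \wedge \tau(B_1) = \mathit{false} \iff
   \xi(B_1) = \{L\} \wedge B_2 \in L \wedge (B_2, B_1) \notin E^* \wedge B_2 \ne B_1$
6. $(B_1, B_2) \in E \wedge \tau(B_1) = \mathit{true} \implies
   \exists\, L \in \xi(B_1)$ s.t. $B_2 \in L \wedge \{\forall B' \in L\ \ (B', B_1) \notin E^* \wedge B' \ne B_1\}$
7. $B_1 \in V \wedge \tau(B_1) = \mathit{true} \implies \exists$ unique $L \in \xi(B_1)$ s.t.
   $\forall B_2 \in L\ \ (B_1, B_2) \in E \wedge (B_2, B_1) \notin E^* \wedge B_2 \ne B_1$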

Rule 1 ensures that A is the root of the justification. Rules 2 and 3 are the
conditions for adding leaf nodes based on facts. Rules 4 and 5 specify conditions
for justifying false literals, while Rules 6 and 7 deal with true literals.

We will refer to the justification graph built for a true (false) literal as a
true-justification (false-justification).

For example, the true-justification in Figure 1(c) is built as follows: reach(a,d) is
the root (by Rule 1). Now consider the lce {arc(a,c), reach(c,d)} in $\xi(\mathit{reach}(a,d))$.
Since no element of this lce forms a cyclic explanation or equals reach(a,d),
both edges (reach(a,d), arc(a,c)) and (reach(a,d), reach(c,d))
are added to the justification (by Rule 6). Rule 7 guarantees that one and only
one lce is added to the justification. Next we construct true-justifications for
arc(a,c) and reach(c,d) recursively.



3 Conservative Justifier 

We review our algorithm in [8] to construct the justification graph. Its high-level
aspects are sketched in Figure 3. V denotes the vertices (labelled by literals in
the $\xi$’s) and E denotes the edges in this graph.

Given a literal A, the algorithm builds the graph recursively, traversing it
depth-first even as it is constructed. At any point, V is the set of “visited”
vertices, and Done is the set of vertices whose descendants have been completely
explored. $V - \mathit{Done}$ contains exactly those vertices that are ancestors of the
current vertex A.



algorithm Justify(A : atom)
(* Global: P : program, (V, E) : Justification, Done ⊆ V *)
if (A ∉ V) then (* A has not yet been justified *)
    set V := V ∪ {A}
    if (τ(A) = true) then (* true-justification *)             (1)
        let α_A ∈ ξ(A) such that (α_A ∩ V) ⊆ Done               (2)
        if (α_A = {}) then
            set E := E ∪ (A, fact)
        else
            for each B ∈ α_A do
                set E := E ∪ (A, Justify(B))
    else (* false-justification *)
        let {α_A} = ξ(A)                                        (3)
        if (α_A = {}) then
            set E := E ∪ (A, fail)
        else
            if ((α_A ∩ V) ⊈ Done) then
                set E := E ∪ (A, ancestor)
            for each B ∈ (α_A - (V - Done)) do
                set E := E ∪ (A, Justify(B))
    set Done := Done ∪ {A}



Fig. 3. Justification Algorithm 



The algorithm is structured as follows: it takes the literal A whose truth
is to be justified as the input parameter. It determines a locally consistent
explanation for either a true-justification in case $\tau(A) = \mathit{true}$ (line 2) or a
false-justification otherwise (line 3). Finally it justifies the literals in the
explanation set recursively. The selection of the justification is done by
backtracking through let. Correctness of Justify appears in [8].
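
The control structure of Figure 3 can be illustrated on ground programs. The following Python sketch is a simplified reading of the algorithm, not the XSB implementation of [8]: the clause encoding, the helper `least_model` (standing in for the answer tables), the explicit state restoration used to mimic backtracking, and the wrapper `build_justification` are all assumptions of this example.

```python
def least_model(program):
    """Truth values of all atoms; plays the role of the answer tables."""
    model, changed = set(), True
    while changed:
        changed = False
        for head, bodies in program.items():
            if head not in model and any(
                all(b in model for b in body) for body in bodies
            ):
                model.add(head)
                changed = True
    return model


def justify(atom, program, model, visited, done, edges):
    """Extend `edges` with a justification of `atom`; return False if no
    true-justification can be built without closing a cycle."""
    if atom in visited:
        return True
    visited.add(atom)
    bodies = program.get(atom, [])
    if atom in model:  # true-justification
        for body in bodies:
            if not all(b in model for b in body):
                continue  # this body is not an lce
            if any(b in visited and b not in done for b in body):
                continue  # the lce would reuse an ancestor
            saved = (set(visited), set(done), set(edges))
            edges.update((atom, b) for b in body)
            if not body:
                edges.add((atom, "fact"))
            if all(justify(b, program, model, visited, done, edges) for b in body):
                done.add(atom)
                return True
            # Backtrack: undo this attempt and try the next lce.
            for current, old in zip((visited, done, edges), saved):
                current.clear()
                current.update(old)
        visited.discard(atom)
        return False
    # false-justification: the first false literal of every clause
    alpha = {next(b for b in body if b not in model) for body in bodies}
    if not alpha:
        edges.add((atom, "fail"))
    else:
        ancestors = visited - done
        if alpha & ancestors:
            edges.add((atom, "ancestor"))
        for b in sorted(alpha - ancestors):
            edges.add((atom, b))
            justify(b, program, model, visited, done, edges)
    done.add(atom)
    return True


def build_justification(atom, program):
    """Return the edge set of a justification of `atom`, or None."""
    edges = set()
    ok = justify(atom, program, least_model(program), set(), set(), edges)
    return edges if ok else None


# Hypothetical program: a is a fact and also depends on b, b depends on a,
# while p and q depend only on each other and are therefore false.
PROGRAM = {"a": [("b",), ()], "b": [("a",)], "p": [("q",)], "q": [("p",)]}
```

For the atom a, the first candidate explanation {b} is abandoned because b can only be explained through its ancestor a, so the sketch backtracks to the fact clause. For the false atom p, the cycle through q is cut by an edge to the ancestor node.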

Algorithm Justify in [8] assumes that all the predicates in the program
are tabled. Let us analyse its behavior on “mixed” programs containing both
tabled and non-tabled predicates. Observe that the algorithm computes the
explanation set $\alpha_A$ for A prior to building the justification graph rooted at A.
Computing the explanation set corresponds to evaluating the truth values of the
literals in the set. Since this evaluation is done prior to justifying the truths of
the literals in $\alpha_A$, the justifications of the truths of the literals in $\alpha_A$ do
not fail. In fact, the only time a justification gets discarded is when there
is a cycle in a true-justification. Algorithm Justify is the basis of a conservative
justifier.

3.1 Efficiency Issues in Conservative Justification 

Using the XSB tabled LP system we implemented Justify as a post-processing
step following query evaluation. The advantage of using a tabled system for
justification is that the answers in the tables can be directly used for computing
the $\xi$’s (lines 2 and 3). In particular, if all the predicates are tabled then the truth
values of all the literals are stored in the tables. Hence selecting an explanation
from $\xi(A)$ amounts to a simple table lookup. In fact we can show:

Proposition 1 For a logic program consisting of tabled predicates only, the
running time of Justify is proportional to the time taken by initial query evaluation.

Let us examine the behavior of Justify on a program containing both tabled
and non-tabled predicates. In a tabled LP system there is no provision for storing
the truth values of non-tabled literals. Consequently computing $\xi$’s can become
expensive, since non-tabled predicates must be re-executed (Prolog-style) to
ascertain their truth values. In fact, as is shown below, the time for justifying a
single non-tabled literal can become quadratic in its original evaluation time.

Example 2 Consider the following non-tabled factorial logic program:

fac(0, 1).
fac(N, S) :- N > 0, N1 is N - 1, fac(N1, S1), S is S1 * N.

Assume that fac(N,S) is evaluated for some fixed N = n. It is easy to see that the
evaluation time is $O(n)$. The call Justify(fac(n, n!)) will first compute
$\xi(\mathit{fac}(n, n!))$. This set will include fac(n-1, (n-1)!). Algorithm Justify takes
$O(n-1)$ steps to compute $\xi(\mathit{fac}(n, n!))$, since evaluating the truth value of
fac(n-1, (n-1)!) requires that many steps. Next Justify(fac(n-1, (n-1)!)) is
invoked and the above process is repeated. It is easy to see that Justify(fac(n, n!))
will require $O(n^2)$ time.
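
The quadratic behaviour can be checked by counting calls. The Python sketch below is only a cost model of Example 2 under the assumption that one call to fac counts as one step; the function names `evaluate_fac`, `evaluation_steps`, and `conservative_steps` are introduced here for illustration.

```python
def evaluate_fac(n, counter):
    """Prolog-style evaluation of fac(n, _); counter[0] counts fac calls."""
    counter[0] += 1
    if n == 0:
        return 1
    return n * evaluate_fac(n - 1, counter)


def evaluation_steps(n):
    """Number of fac calls made by the original query fac(n, S)."""
    counter = [0]
    evaluate_fac(n, counter)
    return counter[0]


def conservative_steps(n):
    """Number of fac calls made while justifying fac(n, n!) conservatively.

    Justify(fac(k, k!)) first re-executes the body literal fac(k-1, _) to
    learn its truth value, and only then justifies it recursively.
    """
    counter = [0]
    for k in range(n, 0, -1):
        evaluate_fac(k - 1, counter)
    return counter[0]
```

Under this cost model the query itself needs n + 1 calls, whereas conservative justification needs n + (n-1) + ... + 1 = n(n+1)/2 calls.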

One can, however, table all the predicates in a program. In such a case the
truths of fac(n, n!), fac(n-1, (n-1)!), ..., fac(0, 1) are all stored in an
answer table upon completion of query evaluation. Justification will then require
$O(n)$ time, since evaluating the truth of each of the fac’s can be done in $O(1)$ time.
But for time and space efficiency, predicates are selectively tabled in practice [3].
The interesting question now is this: can we design an efficient justifier for mixed
programs without having to suffer the overheads of repeated re-execution of
non-tabled predicates? Indeed, our interest in this question was mainly motivated by
our experience with our XMC justifier for model checking [1]. On large model
checking problems the XMC model checker took a few minutes to produce the
results, whereas the justifier failed to produce the justification even after several
hours. In the next section we present an answer to this question.

4 Speculative Justifier 

The idea behind a speculative justifier is as follows. Suppose we wish to justify the
truth value of p, and suppose there is a clause $p \mathrel{:-} q_1, q_2, \ldots, q_n$ in the program.
Suppose first that we wish to build a true-justification for p. If
$\{q_1, q_2, \ldots, q_n\} \in \xi(p)$ then one can build a justification for p by building
true-justifications for each of the $q_i$’s ($1 \le i \le n$). Without evaluating their
truths a priori, we speculate that $\{q_1, q_2, \ldots, q_n\} \in \xi(p)$ and attempt to build a
true-justification for all of them. If $\{q_1, q_2, \ldots, q_n\} \in \xi(p)$ then all these
justifications will succeed and result in a true-justification for p. If
$\{q_1, q_2, \ldots, q_n\} \notin \xi(p)$ then there must exist at least one $q_i$ for which the
attempt at building a true-justification will fail. Hence this clause cannot provide
any evidence as to why p is true, and we proceed to find another candidate clause.

Now suppose we wish to build a false-justification for p. We speculate again that
there must exist at least one $q_i$ that is false. So we attempt building a
false-justification for each of the $q_i$’s in sequence. If we fail to build a
false-justification for every one of the $q_i$’s, then we can conclude that
a false-justification for p does not exist. On the other hand, if we do succeed,
then we repeat this process on the next clause that unifies with p. Recall from the
definition of justification that to justify that p is false there must exist a false
literal in each of these clauses.

The main advantage of speculative justifiers can be seen when justifying
non-tabled predicates. Recall that non-tabled literals are re-executed during
justification. Speculative justifiers re-execute less often than their conservative
counterparts. Consider the program $\{q_i \mathrel{:-} q_{i-1} \mid 1 \le i \le n\}$, where n is a
constant and $q_0$ is a fact. To build a true-justification for $q_n$, the speculative
justifier will attempt to build a true-justification for $q_{n-1}$, which will in turn
build a true-justification for $q_{n-2}$, and so on. All of these justifications succeed
without ever having to repeat the re-execution of any of the $q_i$’s in $q_n$’s proof.
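
A minimal Python sketch of speculative true-justification for ground, non-tabled programs follows. It is an illustration under stated assumptions rather than the algorithm of Figure 5: the clause encoding, the nested-tuple representation of a justification, and the names `speculate_true` and `build_chain` are hypothetical, and the program is assumed to terminate under Prolog-style execution.

```python
def speculate_true(atom, program, counter):
    """Try to build a true-justification for `atom` without knowing its truth
    value beforehand. Returns a nested tuple, or None if speculation fails.
    counter[0] counts how many literals were attempted."""
    counter[0] += 1
    for body in program.get(atom, []):
        children = []
        for b in body:
            sub = speculate_true(b, program, counter)
            if sub is None:
                break  # this clause provides no evidence; try the next one
            children.append(sub)
        else:
            # Every body literal was justified (or the clause is a fact).
            return (atom, children or ["fact"])
    return None


def build_chain(n):
    """The program {q_i :- q_(i-1) | 1 <= i <= n} with the fact q_0."""
    program = {"q0": [()]}
    for i in range(1, n + 1):
        program[f"q{i}"] = [(f"q{i - 1}",)]
    return program


# Example 1, with the elided body of r replaced by a hypothetical atom s
# that has no clauses.
EXAMPLE_1 = {"p": [("r",), ("t",)], "r": [("s",)], "t": [()]}
```

On the chain program each literal is attempted exactly once, so justifying q_n touches n + 1 literals. On Example 1 the attempt through r is wasted before the clause p :- t succeeds, which is the kind of overhead discussed next.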

4.1 Efficiency Issues 

But speculative justifiers can suffer from inefficiencies. For example, the gains 
accrued by re-executing non-tabled literals less often can be easily offset by 
wasted justifications. We discuss these problems below: 

- The Problem of Wasteful Justifications:
  A naive implementation of speculative justifiers can result in building wasteful
  justifications that are eventually discarded. For example, suppose we wish to
  build a true-justification for p using the clause p :- q, r. Suppose q is
  true and r is false. We will succeed in building a true-justification for q but
  fail to do so for r. So using this clause we will fail to build a true-justification
  for p, and the true-justification built for q is wasted.

- The Problem of Rebuilding Justifications:
  In the above example the justification of q was discarded as being irrelevant for
  justifying p. Now suppose that later on we encounter the literal q again during
  justification. If we do not save the justification of q then we will have to rebuild
  its justification all over again.

We now propose solutions to these two main sources of inefficiency in a 
speculative justifier. 

Lazy Justification To avoid wasteful justifications we justify tabled literals lazily.
The idea is this: suppose we select the clause $p \mathrel{:-} q_1, q_2, \ldots, q_n$ for
justifying p, and assume we wish to build a true-justification for p. Suppose the
literal currently on hand, say $q_i$, is tabled. Then we do a simple table lookup
to verify that $\tau(q_i)$ is true. If this is the case we defer building its justification
and move on to the next literal in the sequence. If $q_i$ is non-tabled then we
build a true-justification for it. We proceed to build justifications for the tabled
literals in the clause only after we succeed in building true-justifications for all of
its non-tabled literals. This idea carries over to false-justifications also.

Sharing Justifications The solution to rebuilding justifications is to save all
of them after they are built the first time. Saving the justifications of both
tabled and non-tabled literals, however, can result in space inefficiencies, especially
if sharing is infrequent and irrelevant justifications outweigh relevant ones. A
practical compromise between never rebuilding and always rebuilding is to
share the justifications of tabled literals only. But note that the justification of a
tabled literal might involve other tabled literals. So we have to avoid copying
the entire justification. Instead we save a “skeleton” of the justification from
which we can reproduce the complete justification. We call this skeleton a partial
justification. Intuitively, the leaf nodes of a partial justification are labelled either
by fail, fact, or ancestor, or by a tabled literal. All of the interior nodes except
the root are labelled by non-tabled literals. Formally:

Definition 4 (Partial Justification) A partial justification for an atom A
with respect to a program P and table T, denoted by $\mathcal{P}_{(P,T)}(A)$, is a directed
acyclic graph $G = (V,E)$ with vertex labels chosen from
$HB(P) \cup \{\mathit{fact}, \mathit{fail}, \mathit{ancestor}\}$ and the edges drawn from
$\{(B_1, B_2) \mid B_1 = A \vee B_1 \notin T\}$. The conditions for
selecting the edges are the same as those used in defining justification (Definition 3).

We drop the parameters P and T and write the partial justification as $\mathcal{P}(A)$
whenever the program and the table are obvious from the context.

[Figure 4: two graphs. One is rooted at reach(a,d) and contains the nodes arc(a,c) and reach(c,d); the other is rooted at reach(c,d) and contains the nodes arc(c,d) and fact.]

Fig. 4. Partial Justification of reach(a,d) and reach(c,d) in Figure 1



Figure 4 shows the partial justifications of reach(a,d) and reach(c,d) for
the example in Figure 1.

We can compose partial justifications to yield a complete justification
for a literal. Informally, composition amounts to “stringing” together the
partial justifications of tabled literals at the leaf nodes labelled by those literals.
For example, in Figure 4, attaching the partial justification of reach(c,d)
to the leaf node labelled reach(c,d) in the partial justification of reach(a,d)
yields the complete justification of reach(a,d). However, care must be exercised
when composing partial justifications. In particular, compositions that produce
cycles in true-justifications must be discarded.
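
Composition can be sketched in Python as follows. This is a hedged illustration, not the implementation described in this paper: partial justifications are assumed to be stored as a dictionary mapping each tabled literal to a pair (edge set, dependent tabled literals), the function name `compose` is hypothetical, and the edge sets in `PARTIAL` are loosely modelled on Figure 4 rather than copied from it.

```python
def compose(root, partial, ancestors=()):
    """String together the saved partial justifications reachable from `root`.

    Returns the edge set of the complete justification, or None when the
    composition would close a cycle (which is not allowed in a
    true-justification).
    """
    if root in ancestors:
        return None
    edges, dependents = partial[root]
    result = set(edges)
    for literal in dependents:
        sub = compose(literal, partial, ancestors + (root,))
        if sub is None:
            return None
        result |= sub
    return result


# Hypothetical store of partial justifications for two tabled literals.
PARTIAL = {
    "reach(a,d)": (
        {
            ("reach(a,d)", "arc(a,c)"),
            ("arc(a,c)", "fact"),
            ("reach(a,d)", "reach(c,d)"),
        },
        ["reach(c,d)"],
    ),
    "reach(c,d)": (
        {("reach(c,d)", "arc(c,d)"), ("arc(c,d)", "fact")},
        [],
    ),
}

# A hypothetical pair of partial justifications that depend on each other.
CYCLIC = {
    "x": ({("x", "y")}, ["y"]),
    "y": ({("y", "x")}, ["x"]),
}
```

Composing from reach(a,d) unions the two edge sets, while the mutually dependent pair is rejected. A fuller justifier would react to a rejected composition by selecting a different partial justification; the sketch only reports the failure.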

4.2 Algorithmic Aspects of Speculative Justification 

The speculative justifier builds a justification by composing several partial
justifications. The algorithm for partial justification is shown in Figure 5. It takes
the following parameters as its input: (i) A, the literal to be justified, (ii) A’s
truth value Tval, and (iii) Anc, a list of tabled literals that are ancestors
of A in the justification. The algorithm builds a true-justification if Tval
is true and a false-justification if Tval is false. It returns in J the partial
justification of A and in D those tabled calls which appear in the leaf nodes of J.
We use clause(A, B) to pick a clause whose head unifies with A, and findall for
aggregation. T denotes the tabled literals and their answers.

Recall that to build the complete justification of A we need to know the
partial justifications of all the tabled literals that the justification of A depends
upon (e.g., reach(a,d) depends on reach(c,d) in Figure 4). Let $D_A = \{B \mid B$ is
a tabled literal that appears as the label of a leaf node in $\mathcal{P}(A)\}$. We refer to
it as the dependent set. We will drop the subscript from the notation for the
dependent set if the literal that it is associated with is clear from the context.
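
The interplay between table lookups, the ancestor list, and the dependent set can be sketched for ground programs as below. The Python code is an approximation written for illustration, not the XSB implementation: the clause encoding, the dictionary `table` mapping tabled atoms to their truth values, and the helper name `_true_body` are assumptions, and non-tabled predicates are assumed to terminate under Prolog-style execution.

```python
def _true_body(atom, body, anc, program, table):
    """Try to justify `atom` as true through one clause body."""
    if not body:
        return {(atom, "fact")}, set()
    edges, deps = set(), set()
    for g in body:
        if g in table:
            # Tabled literal: a table lookup replaces re-execution, and the
            # literal becomes a leaf whose justification is deferred.
            if not table[g] or g in anc:
                return None
            edges.add((atom, g))
            deps.add(g)
        else:
            sub = partial_justify(g, True, anc, program, table)
            if sub is None:
                return None
            edges.add((atom, g))
            edges |= sub[0]
            deps |= sub[1]
    return edges, deps


def partial_justify(atom, tval, anc, program, table):
    """Return (edges, dependent set) of a partial justification, or None."""
    bodies = program.get(atom, [])
    if tval:
        for body in bodies:  # clause selection, retried on failure
            result = _true_body(atom, body, anc, program, table)
            if result is not None:
                return result
        return None
    if not bodies:
        return {(atom, "fail")}, set()
    edges, deps = set(), set()
    for body in bodies:  # every clause needs a false literal
        for g in body:  # literals are tried sequentially
            if g in table:
                if table[g]:
                    continue  # a true tabled literal cannot explain failure
                if g in anc:
                    edges.add((atom, "ancestor"))
                else:
                    edges.add((atom, g))
                    deps.add(g)
                break
            sub = partial_justify(g, False, anc, program, table)
            if sub is not None:
                edges.add((atom, g))
                edges |= sub[0]
                deps |= sub[1]
                break
        else:
            return None  # no false literal found in this clause
    return edges, deps


# Hypothetical ground fragment of a reachability program: reach_* atoms are
# tabled, arc_* atoms are not.
PROGRAM = {
    "reach_ad": [("arc_ad",), ("arc_ac", "reach_cd")],
    "reach_cd": [("arc_cd",)],
    "arc_ac": [()],
    "arc_cd": [()],
}
TABLE = {"reach_ad": True, "reach_cd": True}
```

Calling `partial_justify("reach_ad", True, frozenset(), PROGRAM, TABLE)` first speculates on the clause through arc_ad, fails, and then succeeds through the second clause, leaving reach_cd as a leaf in the dependent set.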

4.3 Properties of a Speculative Justifier 

We will suppose that a speculative justifier is based on algorithm Partial-Justify
and that the complete justification for any literal is obtained by composing all
the partial justifications of the tabled literals it depends on. We state below some
of its important properties.

Proposition 2 On purely tabled logic programs, the speculative justifier coincides
with the conservative justifier.

The above is based on the observation that to justify A the speculative justifier
generates a partial justification which includes its dependent set and fact nodes.
They correspond to an lce for A.

Proposition 3 On purely non-tabled logic programs, the justification time required
by a speculative justifier is proportional to query evaluation time.

This proposition is based on the observation that when a program has no
tabled predicates, the partial justification for A corresponds to its complete
justification and evaluation proceeds in Prolog style.

Theorem 4 The time taken by a speculative justifier for justification is no more
than the time taken by a conservative justifier.

We sketch only the main observation for establishing the above property.
Note that a conservative justifier computes an lce for A by re-executing non-tabled
literals and consulting the answer table for tabled literals. This corresponds to
computing the partial justification of A by a speculative justifier. Moreover, the
search paths for computing lce’s in a conservative justifier and partial
justifications in a speculative justifier also correspond.

algorithm Partial-Justify(A : atom, Tval : truth value, Anc : Ancestors)
(* Local: J : Justification (V, E); D : Dependent Set *)
set (J, D) := (({A}, {}), {})
if (Tval = true) then (* build true-justification *)
    clause(A, B)
    if (B = true) then (* the selected clause is a fact *)
        set J := ({A, fact}, {(A, fact)})
    else
        for each G ∈ B do
            if (G ∈ T) then
                if (τ(G) = true) then
                    if (G ∈ Anc) then
                        fail
                    else
                        set E := E ∪ {(A, G)}
                        set D := D ∪ {G} (* add G to the dependent set *)
                else (* τ(G) ≠ true *)
                    fail
            else (* G is a non-tabled call *)
                set E := E ∪ {(A, G)}
                set (J, D) := (J, D) ∪ Partial-Justify(G, Tval, Anc)
else (* build false-justification *)
    findall(B, clause(A, B), BL)
    if (BL = {}) then (* no clause unifies with A *)
        set J := ({A, fail}, {(A, fail)})
    else
        for each B ∈ BL do
            let G ∈ B (* G is chosen from B sequentially *)
            if (G ∈ T) then
                if (τ(G) = false) then
                    if (G ∈ Anc) then
                        set E := E ∪ {(A, ancestor)}
                    else
                        set E := E ∪ {(A, G)}
                        set D := D ∪ {G} (* add G to the dependent set *)
                else (* τ(G) ≠ Tval *)
                    fail
            else (* G is a non-tabled call *)
                set E := E ∪ {(A, G)}
                set (J, D) := (J, D) ∪ Partial-Justify(G, Tval, Anc)
return (J, D)

Fig. 5. Speculative Justification

While the above theorem only bounds the time taken by a speculative justifier by
that of a conservative one, speculative justifiers can do strictly better. Consider
the non-tabled factorial program in Example 2. By avoiding repeated re-executions,
the speculative justifier will build a true-justification for fac(n, n!) in $O(n)$
steps, whereas the conservative justifier takes $O(n^2)$ steps.

5 Experimental Results 

In [8] we reported on the performance of a conservative justifier based on Justify
(in Section 3) and implemented using the XSB tabled LP system. It was
developed for our XMC model checking environment. Model checking in XMC
corresponds to evaluating a top-level query that denotes the temporal
property of interest. The query succeeds whenever the system being verified satisfies
the property. To succinctly explain the success or failure of the query we use
the XMC justifier. We have now implemented the speculative justifier based on
Partial-Justify (see Section 4). This implementation also uses the XSB system.
Both implementations share the justifications of tabled literals only.

We compare the performance of both justifiers on the model checking
application using our XMC system. Figure 6(a) compares their running times
while Figure 6(b) shows their space usage. The model checking examples used in
these experiments (i-Protocol, ABP, Leader, Sieve) were taken from the XMC
collection. i-Protocol is a sliding window protocol in the GNU UUCP stack,
ABP is the alternating bit protocol, and Leader and Sieve are taken from the
SPIN [2] example suite.

Observe that the running times of the speculative justifier are significantly
better, sometimes by several orders of magnitude. Because of these speedups
the speculative justifier is able to scale up to large problem sizes. For example, on
i-Protocol (window size 1, no livelock) and Leader (size 6), which are instances
of large model checking examples, the speculative justifier took a few minutes
whereas the conservative justifier did not finish even after several hours.

Also observe that the space usage of the speculative justifier appears compa- 
rable to its conservative counterpart. 

Figure 7(a) compares the justification time of the speculative justifier to query
evaluation time, while Figure 7(b) compares their space usage. The measurements
suggest that both the running time and the space usage of the speculative justifier
are nearly proportional to those of query evaluation.

6 Discussion 

We introduced the concept of a speculative justifier, presented an algorithm
for it, and provided experimental evidence of its efficiency and scalability. The
justification algorithm in this paper assumed definite clause logic programs. In
[8] we show how to extend the justification algorithm in a conservative justifier
to normal logic programs evaluated under well-founded semantics. The same
extensions carry over to the justification algorithms used in the speculative justifier.

In this paper our primary focus was on improving the running time of
justification so as to scale to the large problem sizes that we encountered in our
model checking application. The justifier described in this paper can be used with
any other tabled LP system. As far as space usage is concerned, it is possible to
improve it further. One possibility is to control the size of partial justifications.
Recall that a partial justification can include justifications of non-tabled literals.
There are several reasons for controlling the justification of non-tabled literals
and thereby the size of a partial justification. Firstly, justifications of
non-tabled literals can be arbitrarily big. Secondly, users may not be interested in
justifying non-tabled calls. Thirdly, users may prefer to use the familiar 4-port
debugger for non-tabled literals over a justifier. Users can specify the non-tabled
literals that they are not interested in justifying. The justifier will simply evaluate
away such literals without explicitly building a justification for them. Improving
space efficiency using such techniques is a topic that deserves further exploration.

Abstract. Bisimulation is a fundamental notion that characterizes behavioral
equivalence of concurrent systems. In this paper, we study the
problem of encoding efficient bisimulation checkers for finite- as well as 
infinite-state systems as logic programs. We begin with a straightforward 
and short (less than 10 lines) encoding of finite-state bisimulation checker 
as a tabled logic program. In a goal-directed system like XSB, this encod- 
ing yields a local bisimulation checker: one where state space exploration 
is done only until a dissimilarity is revealed. More importantly, the logic 
programming formulation of local bisimulation can be extended to do 
symbolic bisimulation for checking the equivalence of infinite-state con- 
current systems represented by symbolic transition systems. We show how 
the two variants of symbolic bisimulation (late and early equivalences) 
can be formulated as a tabled constraint logic program in a way that 
precisely brings out their differences. Finally, we show that our symbolic 
bisimulation checker actually outperforms non-symbolic checkers even 
for relatively small finite-state systems. 



1 Introduction 

A tabled logic programming system offers an attractive platform for encoding 
computational problems in the specification and verification of systems. The 
XMC system [12] casts the problem of model checking — verifying whether a 
given concurrent system is in the model of a temporal logic formula — as query 
evaluation over an “equivalent” logic program [11]. This formulation is based on 
the connection between models of logic programs and models of temporal logics. 
In this paper, we consider the related problem of bisimulation checking which 
checks for equivalence of two system descriptions. 

Bisimulation checking is a problem of fundamental importance in verification.
Many verification systems such as the Concurrency Workbench of the New
Century (CWB-NC) [3] and CADP [1] incorporate bisimulation checkers in their 
tool sets. Informally, a pair of automata M, M' are said to be bisimilar if for 
every transition in M there exists a corresponding transition in M' and vice 
versa. There has been a lot of research on efficient algorithms for bisimulation 
checking. But the focus of this vast body of work has been on finite-state sys- 
tems, i.e., one assumes that M and M' are both finite state. Hennessy and Lin 
were the first to consider the problem of bisimilarity checking of infinite-state
systems in the setting of value-passing languages [4]. This initial work has been
recently expanded [7,6]. Nevertheless research on this problem remains in a state
of infancy.

In this paper, we explore the use of logic programming for the above problem. 
We begin with a direct formulation of strong- and weak-bisimulation checking for 
finite-state systems (see Section 2). We show that, using query evaluation with 
a tabled logic programming system, this encoding yields a local bisimulation 
checker: one where the state space of the concurrent systems is explored only 
until the first evidence of non-bisimilarity is found. Note that when the systems 
are indeed bisimilar, the local checker explores the entire (reachable) state space. 
Even in this case, our bisimulation checker encoded in XSB logic programming 
system [13] shows performance comparable to the global bisimulation checker in 
CWB-NC. For systems that are non-bisimilar, the local checker outperforms the 
global checker by several orders of magnitude. 

We introduce symbolic transition systems (STSs) which can finitely represent
infinite-state systems (see Section 3). STSs are more general than Symbolic
Transition Graphs (STGs) and STGs with Assignments (STGAs) used in [4]
and [7] respectively. We formulate symbolic bisimulation algorithms for checking
two kinds of equivalences widely studied in the literature, late and early
equivalence, as tabled constraint logic programs (see Section 4). As in
the finite-state case, our formulation is a direct encoding of the definition of the
bisimulation relations themselves. We describe how the programs can be evaluated
using a constraint meta-interpreter implemented in XSB. Our experimental
results show that symbolic bisimulation is practical for realistic systems.
Surprisingly, our results show that even for relatively small finite-state systems, it may
be better to perform symbolic bisimulation on their infinite-state counterparts. We
conclude in Section 5 with a short discussion of the implications of this work.



2 Bisimilarity of Finite-State Systems 

Labeled transition systems (LTSs) are widely used to capture the operational
behavior of concurrent systems. An LTS is denoted by $L = (S, Act, \rightarrow)$, where
$S$ is a finite set of states, $Act$ is a finite set of actions (transition labels), and
$\rightarrow \subseteq S \times Act \times S$ is a transition relation. A transition from state $s$ to $t$ on an
action $a$ is represented by $s \xrightarrow{a} t$. Example LTSs are given in Figure 1.

An LTS $L = (S, Act, \rightarrow)$ is encoded as a set of facts in a logic program $P$
such that whenever $s \xrightarrow{a} t$, then trans(s, a, t) is in $P$. Note that since $s, t \in S$
as well as $a \in Act$ are from finite sets, they can be represented in a logic program
by ground terms. Actions on transitions in an LTS are of two types: actions that
may be effected by an external entity (the environment) are called observable actions, and
actions that are the result of synchronization of subsystems are called internal
actions or $\tau$ actions. Based on this notion of observability there are two variations
of bisimilarity, strong and weak, described below.

2.1 Strong Bisimulation 

Strong bisimulation does not differentiate between internal and observable ac- 
tions. 

Definition 1 (Bisimilarity Relation) Given an LTS $L = (S, Act, \rightarrow)$, $\mathcal{R}$ is
a bisimilarity relation over $L$ if

$\forall s_1, s_2 \in S.\ s_1 \mathcal{R} s_2 \Rightarrow (\forall (s_1 \xrightarrow{a} t_1).\ \exists (s_2 \xrightarrow{a} t_2).\ t_1 \mathcal{R} t_2) \wedge s_2 \mathcal{R} s_1$

Two states in a system are equivalent with respect to bisimulation if they are
related by the largest bisimilarity relation $\mathcal{R}$. Two LTSs can be compared for
bisimilarity by computing bisimulation of their disjoint union. For instance, consider
the LTSs in Figure 1. States $p_{11}$ and $q_{11}$ are not bisimilar as there exists a
transition from $p_{11}$ with action g for which there is no matching transition from $q_{11}$.
As such, states $p_1$ and $q_1$ are not bisimilar, and neither are the states $p_0$ and $q_0$.

Encoding Strong Bisimulation. Using the dual of Definition 1, we can say
that two states in a system are not equivalent with respect to bisimulation if
they are related by the least relation $\overline{\mathcal{R}}$ defined as follows:

$\forall s_1, s_2 \in S.\ s_1 \overline{\mathcal{R}} s_2 \Leftarrow (\exists (s_1 \xrightarrow{a} t_1).\ \forall (s_2 \xrightarrow{a} t_2).\ t_1 \overline{\mathcal{R}} t_2) \vee s_2 \overline{\mathcal{R}} s_1$    (1)

Note that $\overline{\mathcal{R}}$ can be encoded as a logic program by exploiting the least model
computation of logic programs as follows:

    bisim(S1, S2) :- tnot(nbisim(S1, S2)).

    nbisim(S1, S2) :- trans(S1, A, T1),
                      no_matching_trans(S2, A, T1).
    nbisim(S1, S2) :- nbisim(S2, S1).

In the above encoding, nbisim/2 stands for $\overline{\mathcal{R}}$ defined by Equation 1. The goal
no_matching_trans(S2, A, T1) stands for $\forall (s_2 \xrightarrow{a} t_2).\ t_1 \overline{\mathcal{R}} t_2$ and is in
turn defined as:

    no_matching_trans(S2, A, T1) :-
        forall(trans(S2, A, T2), nbisim(T1, T2)).
        % T1 is not bisimilar to any T2

Note that since the terms S2, A are ground and T2 is free, forall/2 can be
encoded without considering any free variables as follows:

    forall(P, Q) :- findall(Q, P, L), all(L).
    all([]).
    all([Q|Qs]) :- Q, all(Qs).
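
As a cross-check of Equation 1 outside Prolog, the least relation can also be computed by naive iteration. The following Python sketch is an illustration only and is not the tabled XSB encoding: the function names, the triple representation of trans/3, and the small example LTS are assumptions made here. It applies the two nbisim clauses repeatedly until no new pair is added.

```python
def non_bisimilar_pairs(states, trans):
    """Least set of ordered pairs closed under the two nbisim clauses."""
    out = {s: [] for s in states}
    for s, a, t in trans:
        out[s].append((a, t))
    nb = set()
    changed = True
    while changed:
        changed = False
        for s1 in states:
            for s2 in states:
                if (s1, s2) in nb:
                    continue
                # Clause 1: s1 has a move that no equally labeled move of s2 matches.
                unmatched = any(
                    all((t1, t2) in nb for (b, t2) in out[s2] if b == a)
                    for (a, t1) in out[s1]
                )
                # Clause 2: symmetry.
                if unmatched or (s2, s1) in nb:
                    nb.add((s1, s2))
                    changed = True
    return nb


def bisim(states, trans, s1, s2):
    """bisim holds exactly when the pair is outside the least relation."""
    return (s1, s2) not in non_bisimilar_pairs(states, trans)


# Hypothetical example: p and r do a then b; q does a then c.
STATES = ["p0", "p1", "p2", "q0", "q1", "q2", "r0", "r1", "r2"]
TRANS = [
    ("p0", "a", "p1"), ("p1", "b", "p2"),
    ("q0", "a", "q1"), ("q1", "c", "q2"),
    ("r0", "a", "r1"), ("r1", "b", "r2"),
]
```

Unlike the tabled program, this loop visits every pair of states, so it is global rather than goal-directed; it is meant only to make the least-fixed-point reading of the clauses concrete.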

2.2 Local Bisimulation 

Evaluating bisim(s1, s2) using tabled resolution, we can prove or disprove
bisimilarity of states $s_1$ and $s_2$. Note that goal-directed computation with tabling
makes the bisimulation checker “local”: state space exploration is done only until
the proof for bisimilarity or non-bisimilarity is obtained. However, if the given
states are actually bisimilar, then we explore all the states reachable from $s_1$ and
$s_2$. Another important aspect of our encoding is that it can be directly extended
to symbolic bisimulation checking for infinite-state systems.
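
To make the notion of locality concrete, the Python sketch below collects the pairs of states for which a goal nbisim(S1, S2) would be called when a query starts from one given pair. It is a simplified model under stated assumptions: every nested call made through forall/2 is counted, and neither XSB's tabling engine nor early termination is modeled. The name demanded_pairs and the example LTS are hypothetical.

```python
def demanded_pairs(trans, s1, s2):
    """Pairs (p, q) for which nbisim(p, q) is called, starting from (s1, s2)."""
    out = {}
    for s, a, t in trans:
        out.setdefault(s, []).append((a, t))
    seen, todo = set(), [(s1, s2)]
    while todo:
        p, q = todo.pop()
        if (p, q) in seen:
            continue                      # already tabled
        seen.add((p, q))
        todo.append((q, p))               # clause nbisim(S1,S2) :- nbisim(S2,S1).
        for a, t1 in out.get(p, []):
            for b, t2 in out.get(q, []):
                if a == b:
                    todo.append((t1, t2))  # nested call nbisim(T1, T2)
    return seen


# Hypothetical LTS with a component (r0, r1) unrelated to the query.
TRANS = [
    ("p0", "a", "p1"), ("p1", "b", "p2"),
    ("q0", "a", "q1"), ("q1", "b", "q2"),
    ("r0", "c", "r1"), ("r1", "c", "r0"),
]
PAIRS = demanded_pairs(TRANS, "p0", "q0")
```

Here only 6 of the 64 possible state pairs are ever demanded, and no pair involving r0 or r1 is touched, which is the sense in which goal-directed evaluation avoids exploring the whole state space.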

Given any two states in an LTS $(S, Act, \rightarrow)$, the worst case time complexity
of our bisimulation checker is $O(|S| \times |\rightarrow|)$ assuming unit-time table
lookups. The quadratic factor in our encoding comes from checking for
bisimulation between (potentially) every pair of states. Table lookups may add an $|S|^2$
factor to the complexity if tables are organized as a list, or a $\log |S|$ factor if
binary tree data structures are used. It should be noted that there are faster
bisimulation checking algorithms: the Kanellakis-Smolka algorithm [5] runs in
$O(|S| \times |\rightarrow|)$; Paige and Tarjan’s algorithm [9], implemented in CWB-NC,
runs in $O(|\rightarrow| \times \log |S|)$. These algorithms, unlike our implementation,
compute equivalence classes of bisimilar states bottom up and are thus global.
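
For contrast with the local checker, a global computation of equivalence classes can be sketched as follows. This is a naive signature-based partition refinement written in Python for illustration; it is neither the Kanellakis-Smolka nor the Paige-Tarjan algorithm and does not achieve their complexity bounds. The function name and the example LTS are assumptions.

```python
def bisimulation_classes(states, trans):
    """Map each state to a block id; equal ids mean strongly bisimilar."""
    block = {s: 0 for s in states}
    while True:
        ids = {}
        refined = {}
        for s in states:
            # Signature: current block plus the (action, target block) pairs of s.
            moves = frozenset((a, block[t]) for (u, a, t) in trans if u == s)
            refined[s] = ids.setdefault((block[s], moves), len(ids))
        if len(ids) == len(set(block.values())):
            return refined            # no block was split: partition is stable
        block = refined


# Hypothetical example: p and r do a then b; q does a then c.
STATES = ["p0", "p1", "p2", "q0", "q1", "q2", "r0", "r1", "r2"]
TRANS = [
    ("p0", "a", "p1"), ("p1", "b", "p2"),
    ("q0", "a", "q1"), ("q1", "c", "q2"),
    ("r0", "a", "r1"), ("r1", "b", "r2"),
]
CLASSES = bisimulation_classes(STATES, TRANS)
```

Every state is classified regardless of which pair one is interested in, which is what makes this style of algorithm global.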

2.3 Weak Bisimulation 

In practical settings two systems are considered to be equivalent when they are 
identical with respect to the observable actions. Weak bisimulation or obser- 
vational equivalence formalizes this notion. It is defined on the basis of weak 
transition relation. 

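Definition 2 (Weak Transition Relation) Given an LTS $L = (S, Act, \rightarrow)$, the
weak transition relation $\rightarrow_w \subseteq S \times Act \times S$ is the smallest relation such that

1. $a \neq \tau$ and $s_1 (\xrightarrow{\tau})^* s_2 \xrightarrow{a} s_3 (\xrightarrow{\tau})^* t_1 \Rightarrow s_1 \xrightarrow{a}_w t_1$

2. $s_1 (\xrightarrow{\tau})^* t_1 \Rightarrow s_1 \xrightarrow{\tau}_w t_1$

Note that (2) implies that for every state $s$, $s \xrightarrow{\tau}_w s$.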

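Definition 3 (Weak Bisimilarity) Given an LTS $L = (S, Act, \rightarrow)$, $\mathcal{R}_w$ is
a weak bisimilarity relation over $L$ if

$\forall s_1, s_2 \in S.\ s_1 \mathcal{R}_w s_2 \Rightarrow (\forall (s_1 \xrightarrow{a} t_1).\ \exists (s_2 \xrightarrow{a}_w t_2).\ t_1 \mathcal{R}_w t_2) \wedge s_2 \mathcal{R}_w s_1$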

Encoding Weak Bisimulation. We begin with the encoding of the weak transition
relation. Note that $(\xrightarrow{\tau})^*$ is the reflexive transitive closure of $\xrightarrow{\tau}$ and can be encoded
as follows:

    taustar(S1, S1).
    taustar(S1, S2) :- taustar(S1, T), trans(T, tau, S2).

Using taustar/2, the weak transition relation can be directly encoded as follows:

    weak_trans(S1, tau, T1) :- taustar(S1, T1).
    weak_trans(S1, A, T1) :- taustar(S1, S2), trans(S2, A, S3),
                             A \= tau, taustar(S3, T1).
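
The two relations above can be mirrored in a few lines of Python to see which weak transitions a small system has. This is a sketch under assumptions: the set-based fixed point stands in for tabled evaluation, and the names TAU, taustar, weak_trans and the example LTS are chosen here for illustration.

```python
TAU = "tau"


def taustar(states, trans):
    """reach[s] = states reachable from s by zero or more tau steps."""
    reach = {s: {s} for s in states}          # taustar(S1, S1).
    changed = True
    while changed:
        changed = False
        for s1 in states:
            for (t, a, s2) in trans:
                # taustar(S1, S2) :- taustar(S1, T), trans(T, tau, S2).
                if a == TAU and t in reach[s1] and s2 not in reach[s1]:
                    reach[s1].add(s2)
                    changed = True
    return reach


def weak_trans(states, trans):
    """Set of weak transitions (s1, a, t1) following the two weak_trans clauses."""
    reach = taustar(states, trans)
    weak = {(s1, TAU, t1) for s1 in states for t1 in reach[s1]}
    for s1 in states:
        for (s2, a, s3) in trans:
            if a != TAU and s2 in reach[s1]:
                weak |= {(s1, a, t1) for t1 in reach[s3]}
    return weak


# Hypothetical chain: s0 -tau-> s1 -a-> s2 -tau-> s3
STATES = ["s0", "s1", "s2", "s3"]
TRANS = [("s0", TAU, "s1"), ("s1", "a", "s2"), ("s2", TAU, "s3")]
WEAK = weak_trans(STATES, TRANS)
```

In this example s0 can weakly perform a and end in either s2 or s3, and every state has a weak tau transition to itself, as the note after Definition 2 observes.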

Note that the only difference between Definitions 3 and 1 lies in the selection
of the matching transition, i.e., $\exists (s_2 \xrightarrow{a}_w t_2).\ t_1 \mathcal{R}_w t_2$. Definition 1 uses the strong
transition $\rightarrow$ whereas Definition 3 uses the weak transition $\rightarrow_w$. Thus the previous
encoding of strong bisimilarity can be changed as follows to compute weak
bisimilarity:

    weak_bisim(S1, S2) :- tnot(nweak_bisim(S1, S2)).

    nweak_bisim(S1, S2) :- trans(S1, A, T1), no_matching_trans(S2, A, T1).
    nweak_bisim(S1, S2) :- nweak_bisim(S2, S1).

    no_matching_trans(S2, A, T1) :-
        forall(weak_trans(S2, A, T2), nweak_bisim(T1, T2)).



2.4 Experimental Results 

Below, we compare the performance of our local bisimulation checker with the
one based on the partition refinement algorithm [9] implemented in CWB-NC. The
example systems selected are families of stack(b,d), queue(b,d) and
multi_link_buffer(b,d), where d denotes the data domain size and b denotes
buffer lengths in the first two systems and the number of buffers in the third
system. All measurements were made on a Sun 4U sparc Ultra Enterprise with 2G
memory running Solaris 5.2.6, using XSB v2.3 and CWB-NC v1.11.

A stack(b,d) is defined with a buffer of fixed size b, where insert and delete
actions respectively add and delete data to and from the top of the buffer.
In the case of queue(b,d), in contrast, the insert action adds data to the bottom of the buffer
and the delete action removes data from the top. A multi_link_buffer(b,d) is a chain
of b buffers (each of length 1), where each buffer can insert and delete data.
However, visible actions include only insertion to the first buffer and deletion
from the last buffer. All other insert and delete actions are synchronized actions:
a delete from the i-th buffer is synchronized with the insert action to the (i+1)-th
buffer ($1 \leq i < b$). These actions are represented as $\tau$ transitions. The domain of
each element in the buffer ranges over d distinct values.

Fig. 2. a. stack(2,2) b. multi_link_buffer(2,2)

Figure 2(a) shows the transition system of a stack(b,d) with b = 2 and d = 2,
where insert?1 and delete!1 represent input and output actions that insert and
delete data value 1 to and from the stack. Similarly, Figure 2(b) shows the
transition system of a multi_link_buffer(b,d) with b = 2 and d = 2.
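
The ground transition systems of the stack and queue families can be generated mechanically. The Python sketch below is one possible construction, assuming that a state is the tuple of stored values with the top of the buffer at the right end, and that labels are pairs such as ("insert", 1); the function buffer_lts and this representation are hypothetical and are not taken from the benchmark files used in the experiments.

```python
from itertools import product


def buffer_lts(b, d, fifo):
    """Transitions of stack(b,d) (fifo=False) or queue(b,d) (fifo=True).

    A state is the tuple of stored values; the top of the buffer is the
    right end of the tuple. Labels are ("insert", v) and ("delete", v).
    """
    values = range(1, d + 1)
    trans = set()
    for n in range(b + 1):
        for buf in product(values, repeat=n):
            if n != b:                      # room left: insert is enabled
                for v in values:
                    # A queue inserts at the bottom, a stack at the top.
                    nxt = (v,) + buf if fifo else buf + (v,)
                    trans.add((buf, ("insert", v), nxt))
            if buf:                         # non-empty: delete is enabled
                # Both systems delete from the top.
                trans.add((buf, ("delete", buf[-1]), buf[:-1]))
    return trans


stack22 = buffer_lts(2, 2, fifo=False)
queue22 = buffer_lts(2, 2, fifo=True)
```

For b = 2 and d = 2 each system has 7 states and 12 transitions. After inserting 1 and then 2, the stack offers delete of 2 while the queue offers delete of 1, which is the kind of difference a bisimulation checker has to find.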





Fig. 3. a. Bisimulation of two identical stack(b,d)s in XSB and CWB-NC b. Weak
bisimulation of queue(b,d)s and multi_link_buffer(b,d) in XSB and CWB-NC



Figure 3(a) shows the time taken (on logarithmic scale) to check for strong
bisimilarity of two identical stack(b,d)s for different combinations of b and d.
The number of transitions in the system is $O(d^b)$. Since the systems are bisimilar,
both our encoding and the CWB-NC explore the entire state space. As shown
in Figure 3(a), the XSB implementation is roughly 3 times faster than the CWB-NC
implementation and hence comparable. Similar results are obtained when we
perform weak bisimilarity checking of queue(b,d) and multi_link_buffer(b,d)
(Figure 3(b)).

We now compare the time taken to check strong bisimilarity of stack(b,d) and
queue(b,d) for different combinations of b and d. These systems are not bisimilar
when $b \geq 2$ and $d \geq 2$. In this case the local bisimulation checker implemented
in XSB outperforms the global checking algorithm implemented in CWB-NC. The
XSB implementation can check for bisimilarity of stack(b,d) and queue(b,d) for
b = 5000, $d \geq 2$ in 41.70 secs, whereas with CWB-NC the largest system we can
check is b = 7, d = 4, which takes about 40.50 secs. We also performed weak
bisimulation checking between stack(b,d) and multi_link_buffer(b,d). In this
case the time taken for the XSB implementation to detect non-bisimilarity between
two systems with b = 100 and $d \geq 2$ is 20.91 secs, while the largest system for which
CWB-NC can check non-bisimilarity is b = 7 and d = 3, in about 79.45
secs. It is worth mentioning here that local bisimulation checking based on XSB
is independent of d in both cases. An inspection of query evaluation in
XSB reveals that only two elements from the data domain are sufficient to
prove non-bisimilarity of the given systems. However, the same is not the case
for CWB-NC as it uses a global bisimilarity checking algorithm.

3 Infinite-State Systems 

We introduce the notion of Symbolic Transition Systems (STSs) as a way to 
represent large or infinite-state systems. An STS can be viewed as an LTS aug- 
mented with state variables, guards on transitions, and nonground terms as 
action labels. Infinite-state systems can be represented finitely by STSs. 

3.1 Symbolic Transition Systems 

We assume the standard notion of terms, substitutions and unifiers. We use $\mathcal{V}$
to denote an enumerable set of variables, $\mathcal{F}$ to denote a set of function symbols,
$\mathcal{P}$ to denote a set of predicate symbols, and $\mathcal{B}$ to denote {true, false}. Function
and predicate symbols have fixed arity; function symbols of arity 0 are called
constants. Expressions, denoted by $\mathcal{E}$, are terms over $\mathcal{F} \cup \mathcal{V}$, and guards, denoted
by $\gamma$, are terms over $\mathcal{F} \cup \mathcal{P} \cup \mathcal{V}$ where predicate symbols appear at (and only at)
the root. The set of variables in a term $t$ is denoted by $vars(t)$. Substitutions,
denoted by $\theta$ and $\sigma$ (possibly primed or subscripted), are mappings from $\mathcal{V}$ to
$\mathcal{E}$. A substitution that maps variable $x$ to value $v$ is written as $[v/x]$. A term
$t$ under substitution $\sigma$ is denoted by $t\sigma$; the composition of two substitutions
$\sigma_1$, $\sigma_2$ is denoted simply by $\sigma_1\sigma_2$.

A guard $\gamma$ of arity $n$ is interpreted as a mapping from $\mathcal{F}^n$ to $\mathcal{B}$. Alternatively,
$\gamma$ can be viewed as a set of substitutions such that $\sigma \in \gamma$ iff $\gamma\sigma = true$. An
action is a term in one of the following forms:

- Input Action: Represented as c?x, where c is a constant and x is a variable.

- Output Action: Represented as c!e, where c is a constant and e is an expression.

- Internal Action: Represented by $\tau$, a constant.

Output actions without the expression parameter, as in c!, are also known as signals,
and are simply represented as c.

Definition 4 (Symbolic Transition System) A symbolic transition system
is a finite labeled directed graph $(S, \rightarrow)$, where $S$ is a set of terms, called
locations, which form the vertices of the graph, and $\rightarrow$ is the edge relation where each
edge $s \xrightarrow{\alpha, \gamma, \rho} t$ is labeled with:

- an action $\alpha$ such that

    • $vars(\alpha) \subseteq vars(s)$ if $\alpha$ is not an input action,

    • $vars(\alpha) \cap vars(s) = \emptyset$ if $\alpha$ is an input action;

- a guard $\gamma$ such that $vars(\gamma) \subseteq vars(s)$; and

- a transfer relation $\rho$ that relates $vars(s)$ to $vars(t)$.

If a guard is true it is omitted. The transfer relation is used to model updates to 
variables. The transfer relation is omitted whenever it is the identity mapping 
over the source and target variables. The edge relation of two STSs representing 
a stack and a queue that stores arbitrary data values, with maximum buffer 
length of 2, are shown in Figure 4 (a) and (b) respectively. 



    (a)  s0       --insert?x-->  s1(x)
         s1(x)    --delete!x-->  s0
         s1(x)    --insert?y-->  s2(x,y)
         s2(x,y)  --delete!y-->  s1(x)

    (b)  q0       --insert?x-->  q1(x)
         q1(x)    --delete!x-->  q0
         q1(x)    --insert?y-->  q2(x,y)
         q2(x,y)  --delete!x-->  q1(y)

Fig. 4. Example STS representing 2-length stack (a) and 2-length queue (b) over
arbitrary data domain



Note that the definition of STSs is general enough to capture Symbolic
Transition Graphs (STGs) [4] and STGs with Assignments (STGAs) [7]. For
instance, STGs are STSs where each edge $s \xrightarrow{\alpha, \gamma, \rho} t$ is such that
$vars(t) \subseteq (vars(s) \cup vars(\alpha))$.
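
The side conditions on edges are easy to check mechanically. The following Python sketch assumes a simple tuple representation: a location is (name, var1, ...), an action is ("in", channel, variable), ("out", channel, variable) or ("tau",), and guards and transfer relations are omitted because they are true and the identity in this example. The representation, the function names, and the edge list for a 2-length stack are assumptions for illustration.

```python
def vars_of(location):
    """Variables of a location written as (name, var1, var2, ...)."""
    return set(location[1:])


def well_formed(edge):
    """Check the action conditions of Definition 4 for one edge."""
    source, action, _target = edge
    kind = action[0]
    if kind == "in":      # input variable must not occur in the source
        return action[2] not in vars_of(source)
    if kind == "out":     # output variable must occur in the source
        return action[2] in vars_of(source)
    return action == ("tau",)


def is_stg_edge(edge):
    """STG condition: vars(t) is contained in vars(s) united with vars(action)."""
    source, action, target = edge
    return vars_of(target).issubset(vars_of(source) | set(action[2:]))


# Edges of a 2-length stack over an arbitrary data domain.
STACK2 = [
    (("s0",), ("in", "insert", "x"), ("s1", "x")),
    (("s1", "x"), ("out", "delete", "x"), ("s0",)),
    (("s1", "x"), ("in", "insert", "y"), ("s2", "x", "y")),
    (("s2", "x", "y"), ("out", "delete", "y"), ("s1", "x")),
]

# An ill-formed edge: the input variable x already occurs in the source.
BAD_EDGE = (("s1", "x"), ("in", "insert", "x"), ("s2", "x", "x"))
```

All four stack edges satisfy both checks, so this particular STS is also an STG in the sense described above.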

3.2 Semantics of STS 

The semantics of an STS $\mathcal{S}$ is given in terms of a transition relation, denoted by
$T(\mathcal{S})$, which is generated by interpreting $\mathcal{S}$ with respect to substitutions. Given
an STS $\mathcal{S}$, each state in $T(\mathcal{S})$ is a location $s$ paired with a substitution $\sigma$ on
$vars(s)$. There are different variants of semantics depending on how variables
are interpreted. In the following, we describe late and early semantics, which are
the most widely studied to date.

Late semantics is a natural interpretation of symbolic transition systems,
obtained by “reading off” transitions from a state $s\sigma$ by applying the substitution to
all components of edges of the form $s \xrightarrow{\alpha, \gamma, \rho} t$ from location $s$. This is captured
formally by the following definition.

Definition 5 (Late Transition Relation) Let $\sigma$ be a substitution such that
$s \xrightarrow{\alpha, \gamma, \rho} t \in \mathcal{S}$, $\sigma \in \gamma$ and $\sigma$ satisfies $\rho$. Then $T(\mathcal{S})$ contains $s\sigma \xrightarrow{\alpha\sigma}_l t\sigma$.

One interesting aspect of late semantics is that we only capture substitutions
on variables in the target state of a transition if they are related by $\rho$ to those
in the start state. For instance, consider an input transition of the form $s \xrightarrow{c?x} t$.
From the definition of STS, $x \notin vars(s)$. If $t$ contains $x$, then $x$ does not immediately
pick up a value due to this transition. The variable $x$ is left to be bound by a
guard or transfer relation in a subsequent state. Early semantics interprets the
new variables introduced on input actions by immediately assigning values to
them.

Definition 6 (Early Transition Relation) Let $\sigma$ be a substitution such that
$s \xrightarrow{\alpha, \gamma, \rho} t \in \mathcal{S}$, $\sigma \in \gamma$ and $\sigma$ satisfies $\rho$. Then $T(\mathcal{S})$ contains

    $s\sigma \xrightarrow{\alpha\sigma}_e t\sigma$ if $\alpha$ is not an input action

and further, for every ground term $v$,

    $s\sigma \xrightarrow{c?v}_e t\sigma[v/x]$ if $\alpha = c?x$

The two semantics naturally yield two variants of the bisimulation relation,
as described in Section 4. Below, we describe how an STS can be encoded as a
logic program so that the late semantics can be computed directly by resolution.

Encoding STS as a Constraint Logic Program: The edge relation of an STS $\mathcal{S}$
can be encoded as a constraint logic program $P$ such that for each $s \xrightarrow{\alpha, \gamma, \rho} t \in \mathcal{S}$,
sts_edge(s, $\alpha$, $\gamma$, $\rho$, t) is a fact in $P$. We can encode each guard $\gamma$ as a
predicate in $P$ so that whether $\sigma \in \gamma$ can be checked using the query $\gamma\sigma$. We
can also encode the transfer relation $\rho$ as a predicate in $P$.

The late transition relation (Definition 5) can be computed from this set of
facts using the following rule:

    late_trans(S, A, T) :- sts_edge(S, A, Gamma, Rho, T),
                           Gamma,   % The guard is satisfied
                           Rho.     % and so is the transfer relation

The early transition relation cannot be so directly encoded due to the universal
quantifier over values in its definition (see Definition 6).
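
The universal quantifier becomes visible as soon as one tries to enumerate early transitions. The Python sketch below does so for a single edge over a finite sample domain, which is an assumption made purely for illustration: in the setting of the paper the values range over all ground terms, so no such enumeration exists in general. Locations are written as (name, variables), guards and transfer relations are taken to be true and the identity, only input and output actions are handled, and all names are hypothetical.

```python
def instantiate(location, env):
    """Apply a substitution (a dict) to a location (name, variables)."""
    name, variables = location
    return (name, tuple(env[v] for v in variables))


def early_steps(edge, env, domain):
    """Ground transitions of one edge under early semantics.

    Guards and transfer relations are assumed to be true / the identity.
    """
    source, (kind, channel, x), target = edge
    if kind == "in":
        # One transition per value: the input variable is bound immediately.
        return [(instantiate(source, env), ("in", channel, v),
                 instantiate(target, {**env, x: v})) for v in domain]
    # Output: the value sent is read from the current substitution.
    return [(instantiate(source, env), ("out", channel, env[x]),
             instantiate(target, env))]


INSERT = (("s0", ()), ("in", "insert", "x"), ("s1", ("x",)))
DELETE = (("s1", ("x",)), ("out", "delete", "x"), ("s0", ()))
```

A single symbolic input edge yields one ground transition per value of the domain, whereas under late semantics the same edge contributes one transition whose variable is bound only later.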

4 Symbolic Bisimulation 

Late bisimulation and early bisimulation, which we describe in detail below, differ
in the way input actions are treated. Consider checking the bisimilarity of
locations $p_1$ and $q_1$ in the STSs given in Figure 5. Clearly, locations $p_{111}$, $p_{112}$, $p_{121}$
are all bisimilar to locations $q_{111}$, $q_{112}$, $q_{121}$, $q_{122}$ (all are deadlocked). Furthermore,
location $p_{11}$ is bisimilar to $q_{11}$ when $x = 0$, and is bisimilar to $q_{12}$ if $x \neq 0$;
location $p_{12}$ is bisimilar to $q_{11}$ when $x \neq 0$, and is bisimilar to $q_{12}$ when $x = 0$.
However, are $p_1$ and $q_1$ bisimilar?

When $q_1$ makes a transition, say $q_1 \xrightarrow{c?x} q_{11}$, what is the matching transition
from $p_1$? According to the bisimilarity sets we have computed so far, the
matching transition is the one to $p_{11}$ when $x = 0$ and the one to $p_{12}$ when $x \neq 0$.
These two transitions together cover the transition from $q_1$ to $q_{11}$. However, note
that the action on this transition is an input: c?x. Do we know enough about
the value of $x$ to make the choice between $p_{11}$ and $p_{12}$? According to early
semantics, the value of $x$ is known when a transition is taken. However, according
to late semantics, the value of $x$ is determined only by later guards, and hence
unknown when the transition is taken. Hence, $p_1$ and $q_1$ are early-bisimilar but
not late-bisimilar.

Fig. 5. Example symbolic transition systems (a) and (b)
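
The difference between the two notions in this example is a matter of quantifier order, which the following Python sketch makes explicit. It assumes, as the discussion above suggests, that $p_1$ has input moves to $p_{11}$ and $p_{12}$ and $q_1$ has input moves to $q_{11}$ and $q_{12}$, and it encodes the stated bisimilarity conditions on $x$ directly as a table. The finite sample DOMAIN and all names are assumptions made here; this is not the symbolic checker.

```python
DOMAIN = range(-2, 3)   # finite sample of values for x (an assumption)

# When is a target of p1 bisimilar to a target of q1? (conditions from the text)
BISIMILAR_WHEN = {
    ("p11", "q11"): lambda x: x == 0,
    ("p11", "q12"): lambda x: x != 0,
    ("p12", "q11"): lambda x: x != 0,
    ("p12", "q12"): lambda x: x == 0,
}
P_TARGETS = ["p11", "p12"]   # targets of the c?x moves of p1
Q_TARGETS = ["q11", "q12"]   # targets of the c?x moves of q1


def related(t, u, x):
    """Look up the condition for the pair in either order."""
    key = (t, u) if (t, u) in BISIMILAR_WHEN else (u, t)
    return BISIMILAR_WHEN[key](x)


def early_match(moves, replies):
    # For every move and every value, pick a reply (the value is known first).
    return all(any(related(t, u, x) for u in replies)
               for t in moves for x in DOMAIN)


def late_match(moves, replies):
    # For every move, pick one reply that works for all values.
    return all(any(all(related(t, u, x) for x in DOMAIN) for u in replies)
               for t in moves)


early = early_match(P_TARGETS, Q_TARGETS) and early_match(Q_TARGETS, P_TARGETS)
late = late_match(P_TARGETS, Q_TARGETS) and late_match(Q_TARGETS, P_TARGETS)
```

With the reply chosen after the value, every move can be matched; with the reply chosen before the value, no single reply works for both $x = 0$ and $x \neq 0$.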

Before a formal presentation of the bisimulation relations over STSs, we
motivate their definitions by starting from the basic bisimulation relation for
the finite-state case. In Definition 1, we had

$\forall s_1, s_2 \in S.\ s_1 \mathcal{R} s_2 \Rightarrow (\forall (s_1 \xrightarrow{a} t_1).\ \exists (s_2 \xrightarrow{a} t_2).\ t_1 \mathcal{R} t_2) \wedge s_2 \mathcal{R} s_1$

The question we have now is, having picked a transition $s_1 \rightarrow t_1$, how do we
pick the matching $s_2 \rightarrow t_2$, and, since in the symbolic case the action labels
may bind variables, under what substitution? In the late bisimulation case, recall
that the variable in an input label is bound only afterward, and hence the
matching transition $s_2 \rightarrow t_2$ should be such that $t_1$ and $t_2$ are bisimilar under
all substitutions to the input variable. In summary, the matching transition
must be picked before considering substitutions. This intuition is captured by
the following formal definition of late bisimulation.



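Definition 7 (Late Bisimulation) Given an STS $(S, \rightarrow)$, the late bisimulation
relation with respect to substitution $\theta$, denoted by $\mathcal{R}^\theta_l$, is a subset of $S \times S$
such that

$s_1 \mathcal{R}^\theta_l s_2 \Rightarrow (\forall s_1\theta \xrightarrow{\alpha_1}_l t_1\theta.\ \exists s_2\theta \xrightarrow{\alpha_2}_l t_2\theta.\ \forall \sigma.\ (\alpha_1[\theta\sigma] = \alpha_2[\theta\sigma]) \wedge t_1 \mathcal{R}^{\theta\sigma}_l t_2) \wedge s_2 \mathcal{R}^\theta_l s_1$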



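In the early bisimulation case, recall that the variable in an input action is
bound at the transition itself, so there is no choice to make in terms of substitutions.
This is captured by the following formal definition of early bisimulation.

Definition 8 (Early Bisimulation (using $\rightarrow_e$)) Given an STS $(S, \rightarrow)$,
the early bisimulation relation with respect to substitution $\theta$, denoted by $\mathcal{R}^\theta_e$,
is a subset of $S \times S$ such that

$s_1 \mathcal{R}^\theta_e s_2 \Rightarrow (\forall s_1\theta \xrightarrow{\alpha}_e t_1\theta.\ \exists s_2\theta \xrightarrow{\alpha}_e t_2\theta.\ t_1 \mathcal{R}^\theta_e t_2) \wedge s_2 \mathcal{R}^\theta_e s_1$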

The above definition relies on the definition of the early transition relation.
It turns out, however, that we can define early bisimulation completely in terms
of the late transition relation [10], as follows:

Definition 9 (Early Bisimulation (using $\rightarrow_l$)) Given an STS $(S, \rightarrow)$,
the early bisimulation relation with respect to substitution $\theta$, denoted by $\mathcal{R}^\theta_e$,
is a subset of $S \times S$ such that

$s_1 \mathcal{R}^\theta_e s_2 \Rightarrow (\forall s_1\theta \xrightarrow{\alpha_1}_l t_1\theta.\ \forall \sigma.\ \exists s_2\theta \xrightarrow{\alpha_2}_l t_2\theta.\ (\alpha_1[\theta\sigma] = \alpha_2[\theta\sigma]) \wedge t_1 \mathcal{R}^{\theta\sigma}_e t_2) \wedge s_2 \mathcal{R}^\theta_e s_1$



This alternative definition of early bisimulation is especially important to a logic- 
programming-based encoding since early transition relations are hard to encode 
as logic programs. 

4.1 Encoding Bisimulation Checkers as Logic Programs 

We can encode checkers for equivalence with respect to late as well as early 
bisimulation following the encoding of bisimulation checkers for the finite-state 
case. 

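Early Bisimulation: Consider the complement of the early bisimulation relation $\mathcal{R}^\theta_e$,
written as $\overline{\mathcal{R}^\theta_e}$:

$s_1 \overline{\mathcal{R}^\theta_e} s_2 \Leftarrow (\exists s_1\theta \xrightarrow{\alpha_1}_l t_1\theta.\ \exists \sigma.\ \forall s_2\theta \xrightarrow{\alpha_2}_l t_2\theta.\ (\alpha_1[\theta\sigma] = \alpha_2[\theta\sigma]) \Rightarrow t_1 \overline{\mathcal{R}^{\theta\sigma}_e} t_2) \vee s_2 \overline{\mathcal{R}^\theta_e} s_1$    (2)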

Since bisimulation is the largest such relation, the complement is naturally 
the least relation that satisfies the above equation. This relation can be encoded 
as a constraint logic program as follows: 

    nbisim(S1, S2) :- late_trans(S1, A1, T1), no_matching_trans(S1, A1, T1, S2).
    nbisim(S1, S2) :- nbisim(S2, S1).

    no_matching_trans(S1, A1, T1, S2) :-
        forall((A2,T2), late_trans(S2, A2, T2), nsimulate(A1, T1, A2, T2)).

    nsimulate(A1, T1, A2, T2) :- similar_act(A1, A2), nbisim(T1, T2)
                               ; not_similar_act(A1, A2).

Several differences are apparent between the finite-state bisimulation checker
in Section 2 and the one given above. The first and most obvious difference is the
use of late_trans for trans. The second is the use of a ternary forall predicate
in order to explicitly differentiate between the bound and free variables. Note
that in the finite-state case, there were no free variables in the universally
quantified formula, and hence we could vastly simplify the way forall was encoded.
In the infinite-state case we need to find consistent values for all the free variables
used in the universally quantified formula ($\forall s_2\theta \xrightarrow{\alpha_2}_l t_2\theta \ldots$ in Equation 2).

The third difference is the use of similar_act (and not_similar_act)
to check for $(\alpha_1[\theta\sigma] = \alpha_2[\theta\sigma])$ in Equation 2 (and its negation). Although
similar_act(A1, A2) is A1 = A2, not_similar_act(A1, A2) is not simply the
negation of A1 = A2, for the following reason. Two output actions c!x and c!y
can be dissimilar as long as x and y can be bound to different values. Note that,
in contrast, since an input action creates a new bound variable, c?x and c?y
are always similar. Hence, we have the following encoding for similar_act and
not_similar_act:

    similar_act(A1, A2) :- A1 = A2.

    not_similar_act(A1, A2) :- A1 \= A2,
        (  A1 = in(C,_), ( A2 = in(D,_), C \= D
                         ; A2 = out(_,_)
                         ; A2 = tau )
        ;  A1 = out(_,_)
        ;  A1 = tau ).

Late bisimulation: Let us now consider $\overline{\mathcal{R}^\theta_l}$, the complement of the late bisimilarity
relation:

$s_1 \overline{\mathcal{R}^\theta_l} s_2 \Leftarrow (\exists s_1\theta \xrightarrow{\alpha_1}_l t_1\theta.\ \forall s_2\theta \xrightarrow{\alpha_2}_l t_2\theta.\ \exists \sigma.\ (\alpha_1[\theta\sigma] = \alpha_2[\theta\sigma]) \Rightarrow t_1 \overline{\mathcal{R}^{\theta\sigma}_l} t_2) \vee s_2 \overline{\mathcal{R}^\theta_l} s_1$

The essence of this equation is that in order to show non-bisimilarity, for every
transition from $s_2$ to $t_2$ we should find a local $\sigma$ such that either the actions do not
match, or $t_1$ and $t_2$ are non-bisimilar. This condition can be tested by simply
ensuring that the different transitions from $s_2$ are standardized apart before
checking for matching contexts. Standardization can be done via copy_term/2,
which generates a copy of a term with fresh variables. Late bisimulation can thus
be derived from the encoding of early bisimulation by modifying nsimulate/4
as follows:

    nsimulate(A1, T1, A2, T2) :-
        ( similar_act(A1, A2),
          change_environments(A1, (T1,T2), (U1,U2)),
          nbisim(U1, U2) )
        ; not_similar_act(A1, A2).

    change_environments(in(_,_), E1, E2) :- copy_term(E1, E2).
    change_environments(out(_,_), E1, E1).
    change_environments(tau, E1, E1).

The predicate change_environments/3 ensures that each transition on an input
action from $s_2$ is evaluated in a separate environment, as required by late
bisimulation.
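
The effect of standardizing apart can be imitated outside Prolog. The Python sketch below renames the variables of a term consistently to fresh ones, in the spirit of copy_term/2, and applies the renaming only for input actions, as change_environments/3 does. The Var class, the tuple representation of terms and actions, and the fresh-name scheme are assumptions for illustration; constraint stores are not modeled.

```python
from itertools import count

_fresh = count()   # source of fresh variable names (hypothetical scheme)


class Var:
    """A logic variable, compared by identity."""
    def __init__(self, name):
        self.name = name


def copy_term(term, mapping=None):
    """Copy a term built from tuples, renaming variables consistently."""
    mapping = {} if mapping is None else mapping
    if isinstance(term, Var):
        if term not in mapping:
            mapping[term] = Var("_G%d" % next(_fresh))
        return mapping[term]
    if isinstance(term, tuple):
        return tuple(copy_term(arg, mapping) for arg in term)
    return term   # constants are shared


def change_environments(action, env):
    """Fresh copy of the environment for inputs; unchanged otherwise."""
    return copy_term(env) if action[0] == "in" else env


x, y = Var("x"), Var("y")
ENV = (("t1", x, y), ("t2", x))
COPIED = change_environments(("in", "c", x), ENV)
SAME = change_environments(("out", "c", x), ENV)
```

Within the copy, occurrences of the same original variable still share one fresh variable, so the two target terms remain linked to each other while being separated from the environment of any other transition.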

Discussion: Observe that the nested call to nbisim/2 in the definition of 
nsimulate inherits a new set of constraints from the guards on the two se- 
lected transitions as well as the values under which the actions are similar. In 
our encoding, the current context in which nbisim/2 is evaluated is maintained 
implicitly. This is a useful simplification as compared to the original algorithm 
of Hennessy and Lin [4], where the context of the bisimulation is maintained 
explicitly. The Hennessy-Lin algorithm returns the most general context under 
which the two processes are bisimilar. In a similar vein, when our encoding de- 
tects that two processes are not bisimilar, we can retrieve the context which 
witnesses the non-bisimilarity of the two processes. 

The complexity of the evaluation is $O(|S| \times |\rightarrow|)$ assuming unit-time
table lookup and constraint manipulation, which is the same as Hennessy and Lin’s
procedural algorithm [4]. Furthermore, the encoding clearly separates the logical
aspects of bisimulation from its representational aspects.

Table 1. Time for symbolic bisimulation checking

    stack(n) vs. stack(n)    2.22    3.68    5.60    8.14    11.37


Implementation: Section 4.1 reveals that the bisimulation checker requires both
equality and disequality constraints. We have implemented a constraint
meta-interpreter that handles tabled logic programs over equality and disequality
constraints in XSB. The meta-interpreter maintains the constraint store and
simplifies the constraints as they are propagated, thus simulating a tabled CLP
environment. We use the traditional trick of trading off the costs associated with
maintaining constraint stores always in canonical form against the cost of extra
resolution steps due to undetected inconsistencies in non-canonical constraint
stores. We have also implemented forall to correctly handle free variables in
quantified formulas over the equality domain. Finally, our symbolic bisimulation
algorithm is sound, but the proof of completeness depends on the constraint domain,
whereas our bisimulation algorithm for finite-state systems is both sound and
complete. Details are available at http://www.cs.sunysb.edu/~lmc/bisim.

4.2 Experimental Results 

We measured the performance of our symbolic bisimulation checker on an
infinite-state version of stack and queue. Stacks and queues, denoted by stack(n) and
queue(n), of different buffer lengths (n) but with unspecified domain, were
defined as STSs (e.g., see Figure 4) and encoded as sts_edge facts. Note that even
for fixed values of n, stack(n) and queue(n) are infinite-state systems since each
element in them can store arbitrary data values.

Table 1 shows the time performance for checking early bisimulation, comparing
stack(n) with queue(n) and stack(n) with stack(n). Times for late bisimilarity
checking are about 2% more than their early bisimulation counterparts due
to the extra copy_term overhead. At first sight, this table displays an anomaly:
the time taken to check bisimilarity when two systems are not bisimilar (Table 1, row
1) is more than that for systems which are bisimilar (Table 1, row 2) for the
same buffer length. This appears to contradict our previous observation that
local bisimulation explores less state space and takes less time when the systems
under consideration are not bisimilar as compared to the case when the
systems are bisimilar. However, closer inspection reveals that the symbolic state
space explored in checking for bisimilarity between two stacks is much smaller than
the symbolic state space explored when checking for bisimilarity between a stack
and a queue. In fact, the symbolic (global) state space explored for checking
bisimilarity between two stacks is linear in the buffer length. In contrast, the
proof for non-bisimilarity of stack and queue depends both on buffer length and
data domain size. It is worth mentioning that the state space explored for local
bisimilarity checking depends greatly on the way the two transition systems are
encoded. In the case of checking for bisimilarity between a queue and a stack, if
we first explore transitions with output (delete) actions before exploring those
with input (insert) actions, the number of states that need to be explored before the first
non-similarity is detected is independent of the buffer length.

It should also be noted that the symbolic state space of these systems may,
in fact, be smaller than the ground state space even for data domain sizes as
low as 2. For instance, consider the symbolic state space of a 2-element queue
in Figure 4(b). Its symbolic state space has 3 states, since the states $q_1(x)$ and
$q_1(y)$ are simply variants of each other (i.e., identical modulo names of variables),
and hence identified as a single symbolic state. In contrast, even for a 2-element
data domain, say {1,2}, observe that $q_1(1)$ is a state distinct from $q_1(2)$. Moreover,
$q_2(x,y)$ and $q_2(y,x)$ represent the same symbolic state, while $q_2(1,2)$ and
$q_2(2,1)$ behave differently. This is one of the key reasons why we can do symbolic
bisimulation checking comparing two stacks with buffer length as high as 80 in
around 11 seconds, whereas in the non-symbolic case, even comparing two stacks
with buffer length of 18 each and with data domain size of just 2 explores
over 250K states, taking over 270 seconds in the process.
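
The counting argument for the 2-element queue can be replayed in a few lines of Python. The sketch assumes that a symbolic state is a pair (name, variables) and identifies variants by renaming variables in order of first occurrence; the functions canonical and ground_states are hypothetical helpers and are not part of the checker.

```python
from itertools import product


def canonical(location):
    """Rename variables by order of first occurrence so variants coincide."""
    name, variables = location
    numbering = {}
    return (name, tuple(numbering.setdefault(v, len(numbering)) for v in variables))


def ground_states(d):
    """Ground states of the 2-element queue over the domain {1, ..., d}."""
    values = range(1, d + 1)
    return {("q%d" % n, vals) for n in range(3) for vals in product(values, repeat=n)}


SYMBOLIC = {canonical(loc) for loc in [
    ("q0", ()),
    ("q1", ("x",)), ("q1", ("y",)),
    ("q2", ("x", "y")), ("q2", ("y", "x")),
]}
```

The five symbolic locations collapse to 3 states, independent of the data domain, while the ground state space has $1 + d + d^2$ states, that is, 7 states already for d = 2.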

Finally, the meta-interpretation of constraints in the symbolic bisimulation 
imposes a heavy performance overhead. Using the symbolic bisimulation checker 
for LTSs (i.e., ground transition systems) is nearly 60 times slower than using the 
finite-state bisimulation checker. It is expected that a cleverer encoding of the 
constraint solver, together with the use of attributed variables [2] to integrate 
the solver with the LP engine, will significantly lower these overheads. 



5 Conclusion 

In this paper we demonstrated how the power and versatility of tabled logic 
programming can be used for checking bisimilarity of infinite-state systems in 
a natural way. Our implementation is goal-directed, i.e., we explore only states 
needed to prove or disprove the bisimilarity of the given states, and it can handle 
both early and late versions of strong as well as weak bisimilarity. Furthermore, 
the complexity of this implementation matches that of Hennessy and Lin’s algo- 
rithm modulo table-lookup time. Our experimental results show that the sym- 
bolic bisimulation checker over an infinite-state system can be considerably more 
efficient to use compared to regular bisimulation checking over LTSs generated 
by finite instances of these systems, even for relatively small domain sizes. Apply- 
ing the symbolic checker to real-life verification problems thus appears feasible 
despite significant overheads imposed by the constraint solver. 

A recent paper [8] explored the use of constraint logic programs for checking 
bisimilarity of timed systems, where timed systems are encoded by their cor- 
responding transition relation. The encoding used in that work can be seen as 
specializing the nbisim predicate with respect to the transition relation. It is
therefore expected that our encoding can be used for checking bisimilarity of 
timed systems. However, the performance of the checker will crucially hinge on 
the performance of a constraint solver for linear constraints needed to evaluate 
queries over timed systems. 

Abstract. Tabled logic programming (LP) systems have been applied to
solve very complex problems elegantly and quickly (e.g., model checking).
However, techniques currently employed for incorporating tabling
in an existing LP system are quite complex and require considerable 
change to the LP system. We present a simple technique for incorpo- 
rating tabling in existing LP systems based on dynamically reordering 
clauses containing variant calls at run-time. Our simple technique allows 
tabled evaluation to be performed with a single SLD tree and without 
the use of complex operations such as freezing of stacks and heap. It can 
be incorporated in an existing logic programming system with a small 
amount of effort. Our scheme also facilitates exploitation of parallelism 
from tabled LP systems. Results of incorporating our scheme in the com- 
mercial ALS Prolog system are reported. 



1 Introduction 

Traditional logic programming systems (e.g., Prolog) use SLD resolution with 
the following computation strategy [11]: subgoals of a resolvent are tried from 
left to right and clauses that match a subgoal are tried in the textual order 
they appear in the program. It is well known that SLD resolution may lead to 
non-termination for certain programs, even though an answer may exist via the 
declarative semantics. In fact, this is true of any “static” computation strat- 
egy that is adopted. That is, given any static computation strategy, one can 
always produce a program that will not be able to find the answers due to
non-termination even though finite solutions may exist. In the case of Prolog, programs
containing certain types of left-recursive clauses are examples of such programs.

To get around this problem, researchers have suggested computation strate- 
gies that are dynamic in nature coupled with recording solutions in a memo table. 
A dynamic strategy means that the decision regarding which clause to use next 
for resolution is taken based on run-time properties, e.g., the nature and type 
of goals in the current resolvent. OLDT [18] is one such computation strategy, 
in which solutions to certain subgoals are recorded in a memo table (henceforth
referred to simply as a table). Such a call that has been recorded in the table is
referred to as a tabled call. In OLDT resolution, when a tabled call is encoun- 
tered, computation is started to try the alternative branches of the original call 
and to compute solutions, which are then recorded in the table. These solutions 
are called tabled solutions for the call. When a call to a subgoal that is identical
(up to variable renaming) to a previous call is encountered while computing a tabled
call, it is called a variant call; under SLD resolution it may lead to
non-termination. The OLDT resolution strategy does not expand a variant call as SLD
resolution would; rather, the solutions to the variant call are obtained only by
matching it with tabled solutions. If any solutions are found in the table, they are
consumed one by one by the variant call, just like a list of fact clauses, each
producing a solution for the variant call. Next, the computation of the variant
subgoal is suspended
until some new solutions appear in the table. This consumption and suspension 
continues, until all the solutions for the tabled call have been generated and 
a fixpoint reached. Tabled LP systems have been put to many innovative uses. 
A tabled LP system can be thought of as an engine for efficiently computing 
fixpoints. Efficient fixpoint computation is critical for many applications, e.g., 
model checking [14], program analysis [13], non-monotonic reasoning [3]. 

In this paper, we present a novel, simple scheme for incorporating tabling in 
a standard logic programming system. Our scheme, which is based on dynamic 
reordering of alternatives (DRA) that contain variant calls, allows one to incor- 
porate tabling in an existing LP system with very little effort. Using DRA we 
were able to incorporate tabling in the commercial ALS Prolog system [1] in a 
few months of work. The time efficiency of our tabled ALS (TALS) system is 
comparable to that of the XSB system [2,17,5,7,23] and B-Prolog [21], the two 
tabled LP systems currently available.¹ The space efficiency of TALS is comparable
to that of B-Prolog and XSB with local scheduling and better than that of
XSB with batch scheduling (XSB’s current default scheduling strategy). Unlike 
traditional implementations of tabling [2], DRA works with a single SLD tree 
without requiring suspension of goals and freezing of stacks. Additionally, no 
extra overhead is incurred for non-tabled programs. Intuitively, DRA builds the
search tree as in normal Prolog execution based on SLD; however, when a variant
tabled call is encountered, the branch that led to that variant call is “moved” to
the right of the tree. Essentially, branches of the search tree are reordered during 
execution to avoid exploring potentially non-terminating branches. The princi- 
pal advantage of DRA is that because of its simplicity it can be incorporated 
very easily and efficiently in existing Prolog systems. 

In our dynamic alternative reordering strategy, not only are the solutions to 
variant calls tabled, the alternatives leading to variant calls are also memorized 
in the table (these alternatives, or clauses, containing variant calls are called 
looping alternatives in the rest of the paper). A tabled call first tries its non- 
looping alternatives (tabling any looping alternatives that are encountered along 
the way). Finally, the tabled call repeatedly tries its looping alternatives until 
it reaches a fixpoint. This has the same effect as shifting branches with variant
calls to the right in the search tree. The simplicity of our scheme guarantees that
execution is not inordinately slowed down (e.g., in the B-Prolog tabled system
[21], a tabled call may have to be re-executed several times to ensure that all
solutions are found), nor is a considerable amount of memory used (e.g., in the XSB
tabled system [2] a large number of stacks/heaps may be frozen at any given
time); rather, the raw speed of the Prolog engine is available to execute even
those programs that contain variant calls.

¹ YAP [16] is another system with tabling; its implementation mimics XSB.

An additional advantage of our technique for implementing tabling is that 
parallelism can be naturally exploited. In traditional tabled systems such as 
XSB, the ideas for parallelism have to be reworked and a new model of paral- 
lelism derived [6,16]. In contrast, in a tabled logic programming system based 
on dynamic reordering, the traditional forms of parallelism found in logic pro- 
gramming (or-parallelism and and-parallelism) can still be exploited. Work is in 
progress to augment the or-parallel ALS system [1,8] (currently being developed 
by us [20,9]) with tabling [9]. 

A disadvantage of our approach is that certain non-tabled goals occurring in 
looping alternatives may be computed more than once. However, this recomputa- 
tion can be eliminated by the use of tabling, automatic program transformation, 
or more sophisticated reordering techniques (see later). 

2 SLD and OLDT Resolution 

Prolog was initially designed to be a declarative language [11], i.e., a logic pro- 
gram with a correct declarative semantics should also get the same results via 
its procedural semantics. However, the operational semantics of standard Prolog 
systems that adopt SLD resolution (leftmost-first selection rule and a depth-first 
search rule) is not close to their declarative semantics. The completeness of SLD 
resolution ensures that given a query, the solutions implied by the program, if 
they exist, can be obtained through computation paths in the SLD tree [11]. 
However, standard Prolog systems with a pre-fixed computation rule may only 
compute a subset of these solutions due to problems with non-termination. 

Example 1. Consider the following program: 

r(X, Y) :- r(X, Z), r(Z, Y).     (1)
r(X, Y) :- p(X, Y), q(Y).        (2)

p(a, b). p(a, d). p(b, c).
q(b). q(c).

:- table r/2.
:- r(a, Y).
Following the declarative semantics of logic programs (e.g., employing bottom-up
computation), the program of example 1 above should produce two answers,
Y=b and Y=c. However, a standard Prolog system will go into an infinite loop
for this program. It is Prolog’s computation rule that causes the inconsistency
between its declarative semantics and procedural semantics. With the leftmost-first
selection rule and depth-first search rule, Prolog systems are trapped in an
infinite loop in the SLD tree even though computation paths may exist to the
solutions. It may seem that a breadth-first search strategy would solve the problem
of infinite looping, and it does help in finding the first solution. However, if the
system is required to find all the solutions and terminate, breadth-first search is
not enough, since the SLD tree may contain branches of infinite length.
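
The declarative reading of example 1 can be checked with a short sketch. The Python fragment below is an illustration only and is not part of any Prolog system: it hard-codes the facts of example 1 and applies clauses (1) and (2) bottom-up until no new r/2 tuple is derived. The name least_fixpoint_r is a hypothetical choice.

```python
# Facts of example 1.
P = {("a", "b"), ("a", "d"), ("b", "c")}
Q = {"b", "c"}


def least_fixpoint_r():
    """Naive bottom-up evaluation of r/2: apply both clauses until nothing new appears."""
    r = set()
    while True:
        derived = set()
        # Clause (2): r(X, Y) :- p(X, Y), q(Y).
        derived |= {(x, y) for (x, y) in P if y in Q}
        # Clause (1): r(X, Y) :- r(X, Z), r(Z, Y).
        derived |= {(x, y) for (x, z) in r for (z2, y) in r if z == z2}
        if derived <= r:          # fixpoint: no clause derives a new tuple
            return r
        r |= derived


# Answers to the query r(a, Y).
answers = sorted(y for (x, y) in least_fixpoint_r() if x == "a")
print(answers)  # ['b', 'c']
```

The loop terminates because only finitely many tuples can be derived from the facts, which is the property that a fixed depth-first strategy cannot exploit on the left-recursive clause (1).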

To get around this problem, a tabled evaluation strategy called OLDT is used 
in tabled logic programming systems such as XSB. In the most widely available 
tabled Prolog system, XSB, OLDT is implemented in the following way.² When
a call to a tabled predicate is encountered for the first time, the current com- 
putation is suspended and a new SLD tree is built to compute the answers to 
this tabled call. The new tree is called a generator, while the old tree (which led 
to the tabled call) is called a consumer w.r.t. the new tabled call. When a call 
that is a variant of a previous call — and that may potentially cause infinite loop 
under SLD — is encountered in the generator SLD tree, XSB first consumes the 
tabled solutions of that call (i.e., solutions that have already been computed by 
the previous call). If all the tabled solutions have been exhausted, the current 
call is suspended until some new answers are available in the table. Finally, the 
solutions produced by the generator SLD tree are consumed by the consumer 
SLD tree after its execution is resumed. In XSB, the suspension of the consumer 
SLD tree is realized by freezing the stacks and heap. An implementation based 
on suspension and freezing of stacks may be quite complex to realize, and it
can incur substantial overhead in terms of time and space. Considerable effort is
needed to make such a system very efficient. In this paper, we present a simple
scheme for incorporating tabling in a Prolog system with a small fraction of this
effort. Additionally, our system is comparable in efficiency to existing systems
w.r.t. time and space. 

The OLDT resolution forest for example 1 following XSB style execution is 
shown in figure 1. (The figure also shows the memo-table used for recording 
solutions; the numbers on the edges of the tree indicate the order in which XSB 
will generate those edges). Compared to SLD, OLDT has several advantages: 
(i) A tabled Prolog system avoids redundant computation by memoing the computed
results; in some cases, it can reduce the time complexity of a problem
from exponential to polynomial. (ii) A tabled Prolog system terminates for all
queries posed to bounded term-sized programs that have a finite least fixpoint.
(iii) Tabled Prolog keeps the declarative and procedural semantics of definite
Prolog programs consistent.

² The current XSB system uses SLG resolution (OLDT augmented with negation).



3 Dynamic Reordering of Alternatives (DRA) 

We present a simple technique for implementing tabling that is based on dy- 
namic reordering of looping alternatives at run-time, where a looping alternative 
refers to a clause that matches a tabled call containing a recursive variant call. 
Intuitively, our scheme works by reordering the branches in SLD trees. Branches 
containing variant calls are moved to the right in the SLD tree for the query. In 
our scheme, a tabled call can be in one of three possible states: normal state, 
looping state, or complete state. The state transition graph is shown in figure 2. 




Fig. 2. State Transition Graph 



Consider any tabled call C. The normal state is entered when C is first
encountered during the computation. This first occurrence of C is allowed to
explore the matched clauses as in a standard Prolog system (normal state). In
normal state, while exploring the matching clauses, the system tables all the 
solutions generated for the call C in this state and also checks for variants of C. 
If a variant is found, the current clause that matches the original call to C will be 
memorized, i.e., recorded in the table, as a looping alternative. This call will not 
be expanded at the moment because it can potentially lead to an infinite loop. 
Rather it will be solved by consuming the solutions from the table that have been 
computed by other alternatives. To achieve this, the alternative corresponding 
to this call will be reordered and placed at the end of the alternative list in 
the choice-point. A failure will be simulated and the alternative containing the 
variant will be backtracked over. After exploring all the matched clauses (some 
of which were possibly tabled as looping alternatives), C goes into its looping
state. From this point, tabled call C keeps trying its tabled looping alternatives 
repeatedly (by putting the alternative again at the end of the alternative list 
after it has been tried) until C is completely evaluated. If no new solution is 
added to C’s tabled solution set in any one cycle of trying its tabled looping 
alternatives, then we can say that C has reached its fixpoint. 

C enters its complete state after it reaches its fixpoint, i.e., after all solutions 
to C have been found. In the complete state, if the call C is encountered again 
later in the computation, the system will simply use the tabled solutions recorded
in the table to solve it. In other words, C will be solved simply by consuming its 
tabled solution set one after another as if trying a list of facts. 

Considerable research has been devoted to evaluating recursive queries in
the field of deductive databases [4]. Intuitively, the DRA scheme can be thought
of as roughly equivalent to the following deductive query evaluation scheme for
computing fixpoints of recursive programs: (i) first find all solutions to the query
using only non-recursive clauses in a top-down fashion, (ii) use this initial solu- 
tion set as a starting point and compute (semi-naively) the fixpoint using the 
recursive clauses in a bottom up fashion. By using the initial set obtained from 
top-down execution of the query using non-recursive clauses, only the answers 
to the query are included in the final fixpoint. Redundant evaluations are thus 
avoided as in Magic set evaluation. The proof of correctness of DRA is based on 
formally showing its equivalence to this evaluation scheme [9,10]. 
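
The two-phase scheme just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the TALS implementation: the edge facts e/2, the two-clause program in the comments, and the function name answers are all hypothetical. Phase (i) uses only the non-recursive clause; phase (ii) iterates the recursive clause semi-naively, joining only the solutions found in the previous round.

```python
# Hypothetical facts e/2 for the program
#   r(X, Y) :- e(X, Y).              (non-recursive clause)
#   r(X, Y) :- r(X, Z), e(Z, Y).     (recursive clause)
E = {("a", "b"), ("b", "c"), ("c", "d"), ("x", "y")}


def answers(start):
    """Answers Y to the query r(start, Y), computed in two phases."""
    # Phase (i): solve the query top-down with the non-recursive clause only.
    total = {y for (x, y) in E if x == start}
    delta = set(total)
    # Phase (ii): semi-naive iteration; only the solutions that are new since
    # the previous round (delta) are joined with the recursive clause.
    while delta:
        new = {y for z in delta for (z2, y) in E if z2 == z}
        delta = new - total
        total |= delta
    return total


print(sorted(answers("a")))  # ['b', 'c', 'd']
```

Because the iteration starts from the answers to the query itself, the unrelated fact e(x, y) is never touched when r(a, Y) is evaluated.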

Example 2. Consider evaluating the following program using DRA:

Figure 3 gives the computation tree produced by DRA for example 2 (note
that the labels on the branch refer to the clause used for creating that branch). 
Both clause (1) and clause (3) need to be tabled as looping alternatives for 
the tabled call r(a, Y) (this is accomplished by operations a_add: (1) and 
a_add: (3) shown in Figure 3). The second alternative is a non-looping alter- 
native that produces a solution for the call r(a,Y) which is recorded in the 
table (via the operation s_add shown in the Figure). The query call r(a, Y) 
is a master tabled call (since it is the first call), while all the occurrences of 
r(a, Z) are slave tabled calls (since they are calls to variant of r(a,Y)). When 
the call r(a, Y) enters its looping state, it keeps trying the looping alternatives 
repeatedly until the solution set does not change any more, i.e., until r(a, Y) is 
completely evaluated (this is accomplished by trying a looping alternative, and 
then moving it to the end of the alternatives list). Note that if we added two
more facts, p(d,e) and q(e,f), then we would have to go through the two looping
alternatives one more time to produce the solutions r(a,e) and r(a,f).
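
The life cycle of a single tabled call (normal state, looping state, complete state) can be simulated as follows. This Python sketch rests on assumptions: the three clauses and the facts p/2 and q/2 in the comments are hypothetical, chosen only to agree with the description of example 2 (clauses (1) and (3) reach a variant of the master call, clause (2) does not), and the names dra, variant and try_alternative are illustrative. In the normal state the sketch lets a variant call consume the solutions tabled so far and records the clause as looping; the looping state then retries the recorded clauses, moving each one to the end of the alternative list after it has been tried.

```python
from collections import deque

P = {("a", "b"), ("b", "c")}   # assumed facts p/2
Q = {("c", "d")}               # assumed facts q/2


def clause1(x, variant):       # r(X, Y) :- r(X, Z), p(Z, Y).
    return [y for z in variant() for (z2, y) in sorted(P) if z2 == z]


def clause2(x, variant):       # r(X, Y) :- p(X, Y).
    return [y for (x2, y) in sorted(P) if x2 == x]


def clause3(x, variant):       # r(X, Y) :- r(X, Z), q(Z, Y).
    return [y for z in variant() for (z2, y) in sorted(Q) if z2 == z]


def dra(x, clauses):
    """Evaluate the tabled call r(x, Y); return (solutions, looping, trace)."""
    solutions, looping, trace = [], [], []
    current = None

    def variant():
        # A variant of the master call was reached: record the current clause
        # as a looping alternative and consume the solutions tabled so far.
        if current not in looping:
            looping.append(current)
        return list(solutions)

    def try_alternative(name):
        nonlocal current
        current = name
        added = False
        for y in clauses[name](x, variant):
            if y not in solutions:       # table only new solutions
                solutions.append(y)
                added = True
        return added

    trace.append("normal")
    for name in clauses:                 # textual order, as in Prolog
        try_alternative(name)
    trace.append("looping")
    alternatives = deque(looping)
    quiet = 0                            # consecutive tries without a new solution
    while alternatives and quiet < len(alternatives):
        name = alternatives.popleft()
        quiet = 0 if try_alternative(name) else quiet + 1
        alternatives.append(name)        # reorder: back to the end of the list
    trace.append("complete")
    return solutions, looping, trace


sols, loops, states = dra("a", {"(1)": clause1, "(2)": clause2, "(3)": clause3})
print(sols, loops)  # ['b', 'c', 'd'] ['(1)', '(3)']
```

Under these assumptions the sketch produces the same table contents as the one shown with Figure 3: solutions r(a,b), r(a,c), r(a,d) and looping alternatives (1) and (3).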

An important problem that needs to be addressed in any tabled system is 
detecting completion. When there are multiple tabled calls occurring simulta- 
neously during the computation, and results produced by one tabled call may 
depend on another’s, then knowing when the computation of a tabled call is com- 
plete (i.e., all solutions have been computed) is quite hard. Completion detection 
based on finding strongly connected components (SCC) has been implemented in
the TALS system (details are omitted due to lack of space and can be found
elsewhere [9,10]). Completion detection is very similar to the procedure employed
in XSB and the issues are illustrated in the next two examples. 
r(a,Y).

Calls     Solutions                 Looping Alternatives
r(a,Y)    r(a,b), r(a,c), r(a,d)    (1), (3)

Fig. 3. DRA for Example 2



Example 3. Consider resolving the following program with DRA: 

r(X, Y) :- r(X, Z), r(Z, Y).     (1)
r(X, Y) :- p(X, Y), q(Y).        (2)

p(a, b). p(a, d). p(b, c).
q(b). q(c).

:- table r/2.
:- r(a, Y).

As shown in the computation tree of Figure 4, the tabled call r(b, Y) is 
completely evaluated only if its dependent call r(c, Y) is completely evaluated, 
and r(a, Y) is completely evaluated only if its dependent calls, r(b, Y) and 
r(c, Y), are completely evaluated. Due to the depth-first search used in TALS, 
r(c, Y) always enters its complete state ahead of r(b, Y), and r(b, Y) ahead 
of r(a, Y). The depth-first strategy with alternative reordering guarantees for
such dependency graphs (i.e., graphs with no cycles) that dependencies can be
satisfied without special processing during computation. However, these
dependencies can be cyclic, as in the following example.

Fig. 4. DRA for Example 3

Example 4. Consider resolving the following program with DRA:

r(X, Y) :- p(X, Z), r(Z, Y).     (1)
r(X, Y) :- p(X, Y).              (2)

p(a, b). p(b, a).

:- table r/2.
:- r(a, Y).

Figure 5 shows the complete computation tree of example 4. In this example,
two tabled calls, r(a, Y) and r(b, Y), are dependent on each other, forming
an SCC in the completion dependency graph. It is not clear which tabled call is
completely evaluated first. A proper semantics can be given to the program only
if all tabled calls in an SCC reach their complete state simultaneously. According
to the depth-first computation strategy, the least deep tabled call of each SCC
should be the last tabled call to reach its fixpoint in its SCC. To detect completion
correctly, the table is extended to record the least deep tabled call of each SCC,
so that the remaining calls in the SCC can tell whether they are in the complete
state by checking the state of the least deep call. The state of a tabled call
can be set to “complete” only after its corresponding least deep call is in a
complete state. In this example, there are two occurrences of r(b, Y) during the
computation. In its first occurrence, r(b, Y) cannot be set to “complete” even
though it reaches a temporary fixpoint after exploring its looping alternative,
because it depends on the tabled call r(a, Y), which is not completely evaluated
yet. If the call r(b, Y) is set to the “complete” state at this point, the solution
r(b, b) will be lost. Only after the tabled call r(a, Y) is completely evaluated
during its looping state can the tabled call r(b, Y) (within the same SCC as
r(a, Y)) be set to the complete state.
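
The effect of completing r(b, Y) too early can be reproduced with a small simulation of example 4. The Python sketch below is a simplification under stated assumptions, not the TALS completion algorithm: it retries all clauses of a master call in every cycle (not only the looping alternatives), it treats the outermost call as the least deep call of a single SCC, and it detects the fixpoint by watching the growth of all tables. The names evaluate, call and premature_completion are hypothetical.

```python
P = [("a", "b"), ("b", "a")]           # facts p/2 of example 4


def evaluate(premature_completion):
    """Evaluate r(a, Y); return the table {call argument: tabled solutions}."""
    table, complete, stack = {}, set(), []
    additions = [0]                    # total number of solutions tabled so far

    def add(x, y):
        if y not in table[x]:
            table[x].append(y)
            additions[0] += 1

    def call(x):
        """Solve the tabled call r(x, Y) and return the solutions known so far."""
        if x in complete or x in stack:
            return list(table[x])      # complete call or variant: only consume
        table.setdefault(x, [])
        stack.append(x)                # x is now evaluated as a master call
        while True:
            before = additions[0]
            for (x1, z) in P:          # clause (1): r(X, Y) :- p(X, Z), r(Z, Y).
                if x1 == x:
                    for y in call(z):
                        add(x, y)
            for (x1, y) in P:          # clause (2): r(X, Y) :- p(X, Y).
                if x1 == x:
                    add(x, y)
            if additions[0] == before: # no table grew during this cycle
                break
        stack.pop()
        if not stack:                  # least deep call: the whole SCC completes
            complete.update(table)
        elif premature_completion:     # wrong: temporary fixpoint taken as final
            complete.add(x)
        return list(table[x])

    call("a")
    return table


print(evaluate(False))  # {'a': ['a', 'b'], 'b': ['a', 'b']}
print(evaluate(True))   # {'a': ['a', 'b'], 'b': ['a']}
```

With premature completion the second occurrence of r(b, Y) only consumes its frozen table, so r(b, b) is never derived; deferring completion to the least deep call recovers it.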

4 Implementation 

The DRA scheme can be easily implemented on top of an existing Prolog sys- 
tem. TALS is an implementation of DRA on top of the commercial ALS Prolog 
system. In the TALS system, tabled predicates are explicitly declared. Tabled 
solutions are consumed incrementally to mimic semi-naive evaluation [4]. Mem- 
ory management and execution environment can be kept the same as in a regular 
Prolog engine. Two main data structures, table and tabled choice-point stack, are 
added to the TALS engine. The table data structure is used to keep information 
regarding tabled calls such as the list of tabled solutions and the list of looping 
alternatives for each tabled call, while tabled choice-point stack is used to record 
the properties of a tabled call, such as whether it is a master call (the very first
call) or a slave call (call to the variant in a looping alternative). The master 
tabled call is responsible for exploring the matched clauses, manipulating execu- 
tion states, and repeatedly trying the looping alternatives and solutions for the 
corresponding tabled call, while slave tabled calls only consume tabled solutions. 

Calls     Solutions         Looping Alternatives    Dependency
r(a,Y)    r(a,a), r(a,b)    (1)                     r(a,Y)
r(b,Y)    r(b,a), r(b,b)    (1)                     r(a,Y)

[Completion dependency graph]

Fig. 5. Example 4



The allocation and reclaiming of master and slave choice-points is similar to reg- 
ular choice-points, except that the former have a few extra fields to manage the 
execution of tabled calls. 

Very few changes are required to the WAM engine of a Prolog system to
implement the DRA scheme (more implementation details can be found
elsewhere [9]). We introduce six new WAM instructions, needed for tabled
predicates: table_try_me_else, table_retry_me_else, table_trust_me, table_loop,
table_consume, and table_save. We differentiate between tabled calls and
non-tabled calls at compile-time, and generate the appropriate type of WAM try
instructions. For regular calls, WAM try_me_else, retry_me_else, and trust_me_else
instructions are generated to manage the choice-points, while for tabled calls,
these are respectively modified to table_try_me_else, table_retry_me_else,
and table_trust_me_else instructions. Every time table_try_me_else is
invoked, we have to check if the call is a variant of a previous call. If the call is a
variant, the address of the WAM code corresponding to this clause is recorded
in the table as a looping alternative. The variant call is treated as a slave tabled
call, which will only consume tabled solutions if there are any in the table,
and will not explore any matched clauses. The next-alternative-field of the slave
choice-point is changed to table_consume so that it repeatedly consumes the
next available tabled solutions. If the call is a new tabled call, it will be added
into the table by recording the starting code address and its argument
information. This new tabled call is treated as a master tabled call, which will explore
the matched clauses and generate new solutions. The continuation instruction
of a master tabled call is changed to a new WAM instruction table_save, which
checks if the generated solution is new. If so, the new solution is tabled, and
execution continues with the sub-goal after the tabled call as in normal Prolog
execution. When the last matched alternative of the master choice-point is tried
by first executing the table_trust_me instruction, the next-alternative-field of
the master choice-point is set to the instruction table_loop, so that after
finishing the last matched alternative, upon backtracking, the system will enter
the looping state to try the looping alternatives. After a fixpoint is reached, and
all the solutions have been computed, this instruction is changed to the WAM
trust_me_fail instruction, which de-allocates the choice-point and simulates a
failure, as in normal Prolog execution.
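
The table operations behind these instructions can be pictured with a small data-structure sketch. The Python code below is an assumption-laden illustration and not the TALS data layout: calls are tuples such as ("r", "a", "Y"), variables are strings starting with an upper-case letter, and the names variant_key, Table, try_call and save are hypothetical. The method try_call mirrors the variant check described for table_try_me_else (a new call becomes a master; a variant becomes a slave and the given clause address is recorded as a looping alternative), and save mirrors the newness check described for table_save. The complete state is omitted.

```python
def variant_key(call):
    """Rename variables in order of first occurrence so that variants coincide."""
    names = {}
    key = []
    for arg in call:
        if isinstance(arg, str) and arg[:1].isupper():   # Prolog-style variable
            arg = names.setdefault(arg, "_G%d" % len(names))
        key.append(arg)
    return tuple(key)


class Table:
    def __init__(self):
        # variant key -> {"solutions": [...], "looping": [...]}
        self.entries = {}

    def try_call(self, call, clause_address):
        """Classify a tabled call as 'master' (new call) or 'slave' (variant)."""
        key = variant_key(call)
        entry = self.entries.get(key)
        if entry is None:
            self.entries[key] = {"solutions": [], "looping": []}
            return "master"
        if clause_address not in entry["looping"]:
            entry["looping"].append(clause_address)      # looping alternative
        return "slave"

    def save(self, call, solution):
        """Table a solution if it is new; report whether it was added."""
        solutions = self.entries[variant_key(call)]["solutions"]
        if solution in solutions:
            return False
        solutions.append(solution)
        return True


t = Table()
print(t.try_call(("r", "a", "Y"), clause_address=1))   # master
print(t.try_call(("r", "a", "Z"), clause_address=1))   # slave
print(t.save(("r", "a", "Y"), ("r", "a", "b")))        # True
print(t.save(("r", "a", "Z"), ("r", "a", "b")))        # False
```

Keying the table on a canonical renaming makes r(a, Y) and r(a, Z) share one entry, while r(X, X) and r(X, Y) remain distinct calls.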



5 Recomputation Issues in DRA 

The main idea in the TALS system is to compute the base solution set for 
a tabled call using clauses not containing variants, and then to repeatedly apply
the clauses with variants (looping alternatives) on this base solution set until 
the fixpoint of the tabled call is reached. Due to the looping alternatives being 
repeatedly executed, certain non-tabled goals occurring in these clauses may be 
unnecessarily re-executed. This recomputation can affect the overall efficiency 
of the TALS system. Non-tabled calls may be redundantly re-executed in the 
following three situations. 

First, while trying a looping alternative, the whole execution environment 
has to be built again until a slave tabled choice-point is created. Consider the 
looping alternative: 

p(X, Y) :- q(X), p(X, Z), r(Z, Y).

:- table p/2.

:- p(a, X).

Suppose p/2 is a tabled predicate, while q/1 and r/2 are not. Then each time 
this alternative is tried, q(X) has to be computed since it is not a tabled call. 
That is, the part between the master tabled call and slave tabled call has to be 
recomputed when this alternative is tried again. 

Second, false looping alternatives may occur and may require recomputation. 
Consider the program below: 

p(1).
p(2).

:- table p/1.
:- p(X), p(Y).

After the first goal p(X) gets the solution p(1), a variant call of p(X), namely,
p(Y), is met. According to the DRA scheme, the explored clause is then tabled
as a looping alternative. However, all the matched clauses p(1) and p(2) are
determinate facts, which will not cause any looping problem. The reason we
falsely think that there is a looping alternative is that it is difficult to tell
whether p(Y) is a descendant of p(X) or not. Even worse, the false looping
alternatives will generate the solutions in a different order (the order is “X=1, Y=1”,
“X=2, Y=1”, “X=2, Y=2”, “X=1, Y=2” versus “X=1, Y=1”, “X=1, Y=2”, “X=2, Y=1”,
“X=2, Y=2” for standard Prolog). This problem of false looping alternatives is also
present in XSB and B-Prolog.

Third, a looping alternative may have multiple clause definitions for its non- 
tabled subgoals. Each time a looping alternative is re-tried, all the matching 
clauses of its non-tabled subgoals have to be computed. For example: 



p(a, b).
p(X, Y) :- p(X, Z), q(Z, Y).     (1)
p(X, Y) :- t(X, Y).              (2)
t(X, Y) :- p(X, Z), s(Z, Y).     (3)
t(X, Y) :- s(X, Y).              (4)

:- table p/2.

:- p(a, X).

For the query p(a, X), clause (1) and clause (2) are two looping alternatives.
Consider the second looping alternative. The predicate p(X, Y) is reduced to the
predicate t(X, Y), which has two matching clauses. The first matching clause of
t(X, Y), clause (3), leads to a variant call of p(X, Y), while the second matching
clause, clause (4), is a determinate clause. Each time the looping
alternative, clause (2), is re-tried, both matching clauses for the predicate t(X,
Y) are tried. However, because clause (4) does not lead to any variant of the
tabled call, this recomputation is wasted.

In the first case, fortunately, recomputation can be avoided by explicitly 
tabling the predicate q/1, so that q(X) can consume the tabled solutions of 
q instead of recomputing them. XSB does not have this problem with recom- 
putation, because XSB freezes the whole execution environment, including the 
computation state of q(X) , when the variant call p(X, Z) is reached. This freezing 
of the computation state of q(X) amounts to implicitly tabling it. 

The second case can be solved by finding the scope of the master call. If 
we know that p(Y) is out of the scope of p(X), we can compute p(X) first, 
then let the variant call p(Y) only consume the tabled solutions. However, one 
assumption is that the tabled call p(X) has a finite fixpoint and thus can be 
completely evaluated. 

The final case can be handled in several ways. One option is to table the spe- 
cific computation paths leading to the variants of a previous tabled call instead 
of the whole looping alternative. However, tabling the computation paths will 
incur substantial overhead. A second option is to table the non-tabled predicates,
such as t(X, Y), so that the determinate branches of t(X, Y) will not be
re-tried. A third option is to unfold the call to t(X,Y) in the clause (2) of predicate
p so that the intermediate predicate t(X,Y) is eliminated. 

Thus, all cases where non-tabled goals may be redundantly executed can 
be eliminated. Note that tabling of goals q(X) in case (i) and of goal t(X,Y) 
in case (iii) can be done automatically. The unfolding in case (iii) can also be 
done automatically. At present, in TALS these transformations have to be done 
manually. 

6 Related Work

The most mature implementation of tabling is the XSB [2,19] system from SUNY 
Stony Brook. As discussed earlier, the XSB system implements OLDT by devel- 
oping a forest of SLD trees, suspension of execution via freezing of correspond- 
ing stacks/heap, and resumption of execution via their unfreezing. Recently, 
improvements of XSB, called CAT and CHAT [5], that reduce the amount of 
storage locked up by freezing, have been proposed. Of these, the CHAT system 
seems to achieve a good balance between time and space overhead since it only
freezes the heap; the state of the other stacks is captured and saved in a special
memory area (called the CHAT area).

Because of considerable investment of effort in design and optimization of 
the XSB system [2,17,5,23], XSB has turned out to be an extremely efficient 
system. The modified WAMs that have been designed [23,17], the research done 
in scheduling strategies [7] for reducing the number of suspensions and reducing 
space usage [5] are crucial to the efficiency of the XSB system. Ease of imple- 
mentation and space efficiency are the main advantages of DRA. The scheme 
based on DRA is quite simple to implement on an existing WAM engine, and 
produces performance that is comparable to XSB. 

Recently, another implementation of a tabled Prolog system, based on SLDT
and done on top of an existing Prolog system called B-Prolog, has been
reported [21]. The main idea behind SLDT is as follows: when a variant is
recursively reached from a tabled call, the active choice-point of the original call is
transferred to the call to the variant (the variant steals the choice-point of the 
original call, using the terminology in [21]). Suspension is thus avoided (in XSB, 
the variant call will be suspended and the original call will produce solutions via 
backtracking) and the computation pattern is closer to SLD. However, because 
the variant call avoids trying the same alternatives as the previous call, the com- 
putation may be incomplete. Thus, repeated recomputation [21] of tabled calls is 
required to make up for the solutions lost and to make sure that the fixpoint is 
reached. Additionally, if there are multiple clauses containing recursive variant 
calls, the variant calls may be encountered several times in one computation 
path. Since each variant call executes from the backtracking point of a former 
variant call, a number of solutions may be lost. These lost solutions have to be 
found by recomputation. This recomputing may have to be performed several 
times to ensure that a fixpoint is reached, compromising performance. 
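
The need to iterate until a fixpoint is reached can be illustrated outside any Prolog engine. The following is a minimal Python sketch, not an implementation of SLDT or DRA: it assumes a left-recursive path/2 predicate over a small cyclic graph, and the function name path_fixpoint is hypothetical. Each round re-evaluates both clauses against the current table, and the loop stops only when a round contributes no new answers.

```python
def path_fixpoint(edges):
    """Naive fixpoint iteration for
         path(X, Y) :- path(X, Z), edge(Z, Y).
         path(X, Y) :- edge(X, Y).
    Returns the answer table and the number of rounds needed."""
    table = set()
    rounds = 0
    while True:
        rounds += 1
        # Second clause: every edge is a path.
        new = set(edges)
        # First clause: extend each tabled path by one edge.
        new |= {(x, y) for (x, z) in table for (z2, y) in edges if z == z2}
        if new <= table:
            # No new answers in this round: the fixpoint has been reached.
            return table, rounds
        table |= new


# A three-node cycle: plain SLD resolution would loop on path(a, Y).
edges = {("a", "b"), ("b", "c"), ("c", "a")}
answers, rounds = path_fixpoint(edges)
print(len(answers), rounds)
```

On this graph every round before the last one discovers answers that earlier rounds missed, which is the cost that repeated recomputation pays.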

Observe that DRA (used in TALS) is not an improvement of SLDT
(used in B-Prolog) but rather a completely new way of implementing a tabled LP
system (both were conceived independently). The techniques used for evaluating
tabled calls and for completion detection in the TALS system are quite different 
(even though implementations of both DRA and SLDT seem to be manipulating 
choice-points). B-Prolog is efficient only for certain types of very restricted pro- 
grams (referred to as directly recursive in [21]; i.e., programs with one recursive 
rule containing one variant call). For programs with multiple recursive rules or 
with multiple variant calls, the cost of computing the fixpoint in B-Prolog can 
be substantial. 
7 Performance Results

The TALS system has been implemented on top of the WAM engine of the 
commercial ALS Prolog system. It took us less than two man-months to re- 
search and implement the dynamic reordering of alternatives (DRA) scheme 
(with semi-naive evaluation) along with full support for complex terms on top 
of the commercial ALS Prolog system (tabled negation is not yet supported; work
is under way). Our performance data indicate that in terms of time and space
efficiency, our scheme is comparable to XSB and B-Prolog. The main advantage 
of the DRA scheme is that it can be incorporated relatively easily in existing 
Prolog systems. Note that the most recent releases of XSB (version 2.3) and 
B-Prolog (version 5.0) were used for performance comparison. All systems were 
run on a machine with 700MHz Pentium processor and 256MB of main memory. 
Note that comparing systems is a tricky issue since all three systems employ a 
different underlying Prolog engine. Table 1 shows the performance of the three 
systems on regular Prolog programs (i.e., no predicates are tabled) and gives 
some idea regarding the relative speed of the engines employed by the 3 sys- 
tems (arithmetic on the TALS system seems to be slow, which is the primary 
reason for its poor performance on the 10-Queen and Knight benchmarks compared
to other systems). Note that Sg is the “cousin of the same generation”
program, 10-Queen is an instance of the N-Queens problem, Knight is the Knight’s
tour, Color is the map-coloring problem, and Hamilton is the problem of finding
Hamiltonian cycles in a graph. All figures for all the systems are for
all-solutions queries.

In general, the time performance of TALS on most of the CHAT benchmarks
is worse than that of XSB; however, it is not clear how much of this is due to
differences in base engine speed, and how much is due to TALS’ recomputation of
non-tabled goals leading up to looping alternatives (the fix for this described in
Section 5 could not be used, as the CHAT benchmarks are automatically generated
by a pre-processor and are unreadable by humans). However, except
for read, the performance is comparable (i.e., it is not an order of magnitude
worse). With respect to B-Prolog the time performance is mixed. For programs
with multiple looping alternatives TALS performs better than B-Prolog. Table 2
compares the time efficiency of the XSB, B-Prolog, and TALS systems. These
benchmarks are taken from the CHAT suite of benchmarks distributed with
XSB and B-Prolog.¹ Most of these benchmarks table multiple predicates, many
of which use structures. For XSB, timings for both batch scheduling (XSB-b)
and local scheduling (XSB-l) are reported (note that batch scheduling is currently
the default scheduling strategy in XSB, since local scheduling assumes an
all-solutions query).

Tables 3 and 4 compare the space used by TALS, XSB (both batch and local 
scheduling), and B-Prolog systems. Table 3 shows the total space used by the 

¹ Note that the benchmarks used in Table 1 will not benefit much from tabling, except for
sg, so a different set of benchmarks is used; most of the benchmarks used in Table
2 cannot be executed under normal Prolog.
Table 1. Running Time (Seconds) on Non-tabled Programs

Benchmarks  10-Queen   Sg      Knight   Color   Hamilton
XSB         0.441      0.301   2.63     0.08    1.18
B-Prolog    0.666      0.083   3.15     0.233   2.667
TALS        2.46       0.19    11.26    0.38    1.48


Table 2. Running Time (Seconds) on Tabled Programs

Benchmark   cs_o    cs_r    disj    gabriel  kalah   peep    pg      read    sg
TALS        0.16    0.37    0.26    0.72     0.42    0.52    0.29    5.94    0.04
XSB-b       0.081   0.16    0.05    0.06     0.05    0.18    0.05    0.23    0.06
XSB-l       0.071   0.13    0.041   0.05     0.04    0.131   0.041   0.18    0.05
B-Prolog    0.416   0.917   0.233   0.366    0.284   1.417   0.250   0.883   0.084


Table 3. Total Space Usage in Bytes (Excluding Table Space)

Benchmark   cs_o    cs_r    disj    gabriel  kalah   peep      pg       read      sg
TALS        8360    8438    12193   17062    23520   6800      20084    20426     2226
XSB-b       11040   13820   10012   30356    43628   1148296   436012   1600948   3096
XSB-l       6992    8584    6876    23156    9564    19448     16324    125342    3540
B-Prolog    21040   38592   16484   37596    61288   96884     64232    72916     1664


Table 4. Space Overhead for Tabling in Bytes

Benchmark   cs_o    cs_r    disj    gabriel  kalah   peep      pg       read      sg
TALS        672     750     213     190      376     976       420      2666      342
XSB-b       2544    4016    2568    16172    16784   1132596   363872   1356672   0
XSB-l       696     1392    1632    10848    1612    7732      7768     63720     0
B-Prolog    n/a     n/a     n/a     n/a      n/a     n/a       n/a      n/a       n/a


system. This space includes total stack and heap space used as well as space 
overhead to support tabling (but excluding space used for storing the table). 
The space overhead to support tabling in case of TALS includes the extra space 
needed to record looping alternatives and extra fields used in master and slave 
choice-points. In case of both XSB-l and XSB-b, the figure includes the CHAT
space used. For B-Prolog it is difficult to separate this overhead from the actual
heap + stack usage. Space overhead incurred is separately reported in Table 4.
As can be noticed from Table 3, the space performance of TALS is significantly
better than that of XSB-b (for some benchmarks, e.g., peep, pg and read, it is
orders of magnitude better). It is also better than the space performance of
B-Prolog (perhaps due to the extra space used during recomputation in B-Prolog)
Table 5. Table Space Usage in Bytes

Benchmark   cs_o    cs_r    disj    gabriel  kalah   peep    pg      read    sg
TALS        21056   21400   6488    7244     13496   17256   3852    15404   25128
XSB-b       26572   27072   22768   199948   35784   22688   15876   48032   47568
XSB-l       25356   25858   21592   19076    34160   21920   15108   45944   42448
B-Prolog    20308   20396   20104   16492    26884   15260   13860   38388   69740


and is comparable in performance to XSB-l. For completeness’ sake, we also
report the space used in storing the table for each of the 4 systems in Table 5.
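
The "orders of magnitude" remark about peep, pg and read can be checked directly from the Table 3 figures. The short Python sketch below only divides the XSB-b totals by the TALS totals; the dictionary names are hypothetical and the numbers are copied from Table 3.

```python
# Total space in bytes (excluding table space), copied from Table 3.
tals_bytes = {"peep": 6800, "pg": 20084, "read": 20426}
xsb_b_bytes = {"peep": 1148296, "pg": 436012, "read": 1600948}

# Ratio of XSB-b space to TALS space for each benchmark.
ratios = {name: xsb_b_bytes[name] / tals_bytes[name] for name in tals_bytes}

for name, ratio in ratios.items():
    print(f"{name}: XSB-b uses about {ratio:.0f} times the space of TALS")
```

All three ratios exceed 10, and the one for peep exceeds 100.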

8 Conclusion and Future Work 

The advantages of DRA can be listed as follows: (i) It can be easily implemented 
on top of an existing Prolog system without modifying the kernel of WAM engine 
in any major way; (ii) It works with a single SLD tree without suspension of goals 
and freezing of stacks resulting in less space usage; (iii) Unlike SLDT, it avoids 
blindly recomputing subgoals (to ensure completion) by remembering looping 
alternatives; (iv) Unlike XSB with local scheduling, it produces solutions for
tabled goals incrementally while maintaining good space and time performance;
(v) Parallelism can be easily incorporated in the DRA scheme.

Our alternative reordering strategy can be thought of as a dual [12] of the 
Andorra-principle [22]. In the Andorra model of execution, goals in a clause 
are reordered (on the basis of run-time properties, e.g., determinacy) leading 
to a considerable reduction in search space and better termination behavior. 
Likewise, our tabling scheme based on reordering alternatives (which correspond 
to clauses) also reduces the size of the computation (since solutions for tabled 
call once computed are remembered) and results in better termination behavior. 

Our scheme is quite simple to implement. We were able to implement it on 
top of an existing Prolog engine (ALS Prolog) in a few weeks of work. Perfor- 
mance evaluation shows that our implementation is comparable in performance 
to highly-engineered tabled systems such as XSB. Work is in progress to add 
support for tabled negation and or-parallelism, so that large and complex appli- 
cations (e.g., model-checking) can be tried. 
Abstract. In the paper we establish the fixed-parameter complexity
for several parameterized decision problems involving models, supported 
models and stable models of logic programs. We also establish the fixed- 
parameter complexity for variants of these problems resulting from re- 
stricting attention to Horn programs and to purely negative programs. 
Most of the problems considered in the paper have high fixed-parameter 
complexity. Thus, it is unlikely that fixing bounds on models (supported 
models, stable models) will lead to fast algorithms to decide the existence 
of such models. 



1 Introduction 

In this paper we study the complexity of parameterized decision problems con- 
cerning models, supported models and stable models of logic programs. In our 
investigations, we use the framework of the fixed-parameter complexity intro- 
duced by Downey and Fellows [DF97]. This framework was previously used to 
study the problem of the existence of stable models of logic programs in [Tru01].
Our present work extends results obtained there. First, in addition to the class of 
all finite propositional logic programs, we consider its two important subclasses: 
the class of Horn programs and the class of purely negative programs. Second, 
in addition to stable models of logic programs, we also study supported models 
and arbitrary models. 

A decision problem is parameterized if its inputs are pairs of items. The second 
item in a pair is referred to as a parameter. The problems to decide, given a logic 
program P and an integer k, whether P has a model, supported model or a stable 
model, respectively, with at most k atoms are examples of parameterized decision 
problems. These problems are NP-complete. However, fixing k (that is, k is no 
longer regarded as a part of input) makes each of the problems simpler. They 
become solvable in polynomial time. The following straightforward algorithm 
works: for every subset $M \subseteq At(P)$ of cardinality at most k, check whether M
is a model, supported model or stable model, respectively, of P. The check can
be implemented to run in linear time in the size of the program. Since there are
$O(n^k)$ sets to be tested, the overall running time of this algorithm is $O(mn^k)$,
where m is the size of the input program P and n is the number of atoms in P.
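
The straightforward algorithm can be sketched as follows for the case of (arbitrary) models. This is a minimal Python illustration under stated assumptions: a rule is encoded as a hypothetical triple (head, positive_body, negative_body), each rule is read as an implication, and the function names are ours. The supported and stable cases differ only in the check performed on each candidate set.

```python
from itertools import combinations


def atoms_of(program):
    """Collect At(P): every atom occurring in some rule of the program."""
    found = set()
    for head, pos, neg in program:
        found.add(head)
        found.update(pos)
        found.update(neg)
    return found


def is_model(program, m):
    """M is a model if no rule has a true body and a false head."""
    for head, pos, neg in program:
        body_true = all(q in m for q in pos) and all(s not in m for s in neg)
        if body_true and head not in m:
            return False
    return True


def small_models(program, k):
    """Return all models M with |M| <= k by testing the O(n^k) candidates."""
    universe = sorted(atoms_of(program))
    result = []
    for size in range(k + 1):
        for candidate in combinations(universe, size):
            m = frozenset(candidate)
            if is_model(program, m):
                result.append(m)
    return result


# Example program:  p <- not(q).   q <- not(p).
program = [("p", (), ("q",)), ("q", (), ("p",))]
print(small_models(program, 1))
```

The nested loops make the dependence on k explicit: the number of candidate sets, and hence the exponent of the running time, grows with k.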

The problem is that algorithms with running times given by $O(mn^k)$ are not
practical even for quite small values of k. The question then arises whether better 
algorithms can be found, for instance, algorithms whose running-time estimate 
would be given by a polynomial of the order that does not depend on k. Such 
algorithms, if they existed, could be practical for a wide range of values of k and 
could find applications in computing stable models of logic programs. 

This question is the subject of our work. We also consider similar questions 
concerning related problems of deciding the existence of models, supported mod- 
els and stable models of cardinality exactly k and at least k. We refer to all 
these problems as small-bound problems since k, when fixed, can be regarded
as “small”. In addition, we study problems of existence of models, supported
models and stable models of cardinality at most $|At(P)| - k$, exactly $|At(P)| - k$
and at least $|At(P)| - k$. We refer to these problems as large-bound problems,
since $|At(P)| - k$, for a fixed k, can be informally thought of as “large”.

We address these questions using the framework of fixed-parameter complex- 
ity [DF97]. Most of our results are negative. They provide strong evidence that 
for many parameterized problems considered in the paper there are no algorithms 
whose running time could be estimated by a polynomial of order independent 
of k. 

Formally, a parameterized decision problem is a set $L \subseteq \Sigma^* \times \Sigma^*$, where
$\Sigma$ is a fixed alphabet. By selecting a concrete value $a \in \Sigma^*$ of the parameter,
a parameterized decision problem L gives rise to an associated fixed-parameter
problem $L_a = \{x : (x, a) \in L\}$.

A parameterized problem $L \subseteq \Sigma^* \times \Sigma^*$ is fixed-parameter tractable if there
exist a constant t, an integer function f and an algorithm A such that A determines
whether $(x,y) \in L$ in time $f(|y|)\,|x|^t$ ($|z|$ stands for the length of a string
$z \in \Sigma^*$). We denote the class of fixed-parameter tractable problems by FPT.
Clearly, if a parameterized problem L is in FPT, then each of the associated
fixed-parameter problems $L_y$ is solvable in polynomial time by an algorithm
whose exponent does not depend on the value of the parameter y. Parameter- 
ized problems that are not fixed-parameter tractable are called fixed-parameter 
intractable. 

To study and compare the complexity of parameterized problems, Downey
and Fellows proposed the following notion of fixed-parameter reducibility (or,
simply, reducibility).

Definition 1. A parameterized problem L can be reduced to a parameterized
problem L' if there exist a constant p, an integer function q, and an algorithm
A such that:

1. A assigns to each instance $(x,y)$ of L an instance $(x',y')$ of L',

2. A runs in time $O(q(|y|)\,|x|^p)$,

3. x' depends upon x and y, and y' depends upon y only,

4. $(x,y) \in L$ if and only if $(x',y') \in L'$.
We will use this notion of reducibility throughout the paper. If for two parameterized
problems $L_1$ and $L_2$, $L_1$ can be reduced to $L_2$ and conversely, we say
that $L_1$ and $L_2$ are fixed-parameter equivalent or, simply, equivalent.

Downey and Fellows [DF97] defined a hierarchy of complexity classes called 
the W hierarchy. 

$\mathrm{FPT} \subseteq W[1] \subseteq W[2] \subseteq W[3] \subseteq \cdots$ (1)

The classes W[t] can be described in terms of problems that are complete for 
them (a problem D is complete for a complexity class $\mathcal{E}$ if $D \in \mathcal{E}$ and every
problem in this class can be reduced to D). Let us call a Boolean formula t- 
normalized if it is a conjunction-of-disjunctions-of-conjunctions ... of literals, 
with t being the number of conjunctions-of, disjunctions-of expressions in this 
definition. For example, 2-normalized formulas are conjunctions of disjunctions 
of literals. Thus, the class of 2-normalized formulas is precisely the class of CNF 
formulas. We define the weighted t-normalized satisfiability problem as: 

$WS(t)$: Given a t-normalized formula $\Phi$ and a non-negative integer k, decide
whether there is a model of $\Phi$ with exactly k atoms (or, alternatively, decide
whether there is a satisfying valuation for $\Phi$ which assigns the logical value
true to exactly k atoms).
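
For t = 2 the problem can be made concrete with a small brute-force check. The Python sketch below is an illustration only, assuming a hypothetical encoding in which a CNF formula is a list of clauses and a literal is a pair (atom, is_positive); it tries every set of exactly k atoms, so it is exponential in k and is not meant as an efficient procedure.

```python
from itertools import combinations


def satisfies(cnf, true_atoms):
    """A valuation satisfies a CNF formula if every clause has a true literal."""
    return all(
        any((atom in true_atoms) == positive for atom, positive in clause)
        for clause in cnf
    )


def weighted_sat(cnf, k):
    """Decide WS(2): is there a model that makes exactly k atoms true?"""
    universe = sorted({atom for clause in cnf for atom, _ in clause})
    return any(satisfies(cnf, set(c)) for c in combinations(universe, k))


# (x or y) and (not x or not y): exactly one of x, y must be true.
cnf = [[("x", True), ("y", True)], [("x", False), ("y", False)]]
print([weighted_sat(cnf, k) for k in range(3)])
```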

Downey and Fellows show that for every $t \geq 2$, the problem $WS(t)$ is complete
for the class W[t]. They also show that a restricted version of the problem $WS(2)$:

$WS_2(2)$: Given a 2-normalized formula $\Phi$ with each clause consisting of at most
two literals, and an integer k, decide whether there is a model of $\Phi$ with
exactly k atoms

is complete for the class W[1]. There is strong evidence suggesting that all the
inclusions in (1) are proper. Thus, proving that a parameterized problem is
complete for a class W[t], $t \geq 1$, is a strong indication that the problem is not
fixed-parameter tractable.

As we stated earlier, in the paper we study the complexity of parameterized 
problems related to logic programming. All these problems ask whether an input 
program P has a model, supported model or a stable model satisfying some 
cardinality constraints involving another input parameter, an integer k. They 
can be categorized into two general families: small-bound problems and large-bound
problems. In the formal definitions given below, $\mathcal{C}$ denotes a class of logic
programs, $\mathcal{D}$ represents a class of models of interest and $\Delta$ stands for one of the
three arithmetic relations: “$\leq$”, “=” and “$\geq$”.

$\mathcal{D}_\Delta(\mathcal{C})$: Given a logic program P from class $\mathcal{C}$ and an integer k, decide whether
P has a model M from class $\mathcal{D}$ such that $|M| \mathrel{\Delta} k$.

$\mathcal{D}'_\Delta(\mathcal{C})$: Given a logic program P from class $\mathcal{C}$ and an integer k, decide whether
P has a model M from class $\mathcal{D}$ such that $(|At(P)| - k) \mathrel{\Delta} |M|$.

In the paper, we consider three classes of programs: the class of Horn programs
$\mathcal{H}$, the class of purely negative programs $\mathcal{N}$, and the class of all programs
$\mathcal{A}$. We also consider three classes of models: the class of all models $\mathcal{M}$, the class
of supported models $\mathcal{SP}$ and the class of stable models $\mathcal{ST}$.

Thus, for example, the problem $\mathcal{SP}_{\leq}(\mathcal{N})$ asks whether a purely negative logic
program P has a supported model M with no more than k atoms ($|M| \leq k$). The
problem $\mathcal{ST}'_{\leq}(\mathcal{A})$ asks whether a logic program P (with no syntactic restrictions)
has a stable model M in which at most k atoms are false ($|At(P)| - k \leq |M|$).
Similarly, the problem $\mathcal{M}'_{\geq}(\mathcal{H})$ asks whether a Horn program P has a model M
in which at least k atoms are false ($|At(P)| - k \geq |M|$).

In the three examples given above and, in general, for all problems $\mathcal{D}_\Delta(\mathcal{C})$
and $\mathcal{D}'_\Delta(\mathcal{C})$, the input instance consists of a logic program P from the class $\mathcal{C}$
and of an integer k. We will regard these problems as parameterized with k.
Fixing k (that is, k is no longer a part of input but an element of the problem
description) leads to the fixed-parameter versions of these problems. We will
denote them $\mathcal{D}_\Delta(\mathcal{C},k)$ and $\mathcal{D}'_\Delta(\mathcal{C},k)$, respectively.

In the paper, for all but three problems $\mathcal{D}_\Delta(\mathcal{C})$ and $\mathcal{D}'_\Delta(\mathcal{C})$, we establish their
fixed-parameter complexities. Our results are summarized in Tables 1-3.

In Table 1, we list the complexities of all problems in which $\Delta$ = “$\geq$”. Small-bound
problems of this type ask about the existence of models of a program P
that contain at least k atoms. Large-bound problems in this group are concerned
with the existence of models that contain at most $|At(P)| - k$ atoms (the number
of false atoms in these models is at least k). From the point of view of the
fixed-parameter complexity, these problems are not very interesting. Several of them
remain NP-complete even when k is fixed. In other words, fixing k does not
simplify them enough to make them tractable. For this reason, all the entries in
Table 1, listing the complexity as NP-complete (denoted by NP-c in the table),
refer to fixed-parameter versions $\mathcal{D}_{\geq}(\mathcal{C},k)$ and $\mathcal{D}'_{\geq}(\mathcal{C},k)$ of problems $\mathcal{D}_{\geq}(\mathcal{C})$ and
$\mathcal{D}'_{\geq}(\mathcal{C})$. The problem […] is NP-complete for every fixed $k \geq 1$. All other
fixed-parameter problems in Table 1 that are marked NP-complete are
NP-complete for every value $k \geq 0$.

On the other hand, many problems $\mathcal{D}_{\geq}(\mathcal{C})$ and $\mathcal{D}'_{\geq}(\mathcal{C})$ are “easy”. They are
fixed-parameter tractable in a strong sense. They can be solved in polynomial
time even without fixing k. This is indicated by marking the corresponding entries
in Table 1 with P (for the class P) rather than with FPT. There is only one
exception, the problem […], which is W[1]-complete.



Table 1. The complexities of the problems $\mathcal{D}_{\geq}(\mathcal{C})$ and $\mathcal{D}'_{\geq}(\mathcal{C})$.
Small-bound problems for the cases when $\Delta$ = “=” or “$\leq$” can be viewed as
problems of deciding the existence of “small” models (that is, models containing
exactly k or at most k atoms). The fixed-parameter complexities of these
problems are summarized in Table 2.

The problems involving the class of all purely negative programs and the
class of all programs are W[2]-complete. This is a strong indication that they are
fixed-parameter intractable. All problems of the form $\mathcal{D}_{\leq}(\mathcal{H})$ are fixed-parameter
tractable. In fact, they are solvable in polynomial time even without fixing the
parameter k. We indicate this by marking the corresponding entries with P.
Similarly, the problem $\mathcal{ST}_{=}(\mathcal{H})$ of deciding whether a Horn logic program P has
a stable model of size exactly k is in P. However, perhaps somewhat surprisingly,
the remaining two problems involving Horn logic programs and $\Delta$ = “=” are
harder. We proved that the problem $\mathcal{M}_{=}(\mathcal{H})$ is W[1]-complete and that the
problem $\mathcal{SP}_{=}(\mathcal{H})$ is W[1]-hard. Thus, they most likely are not fixed-parameter
tractable. We also showed that the problem $\mathcal{SP}_{=}(\mathcal{H})$ is in the class W[2]. The
exact fixed-parameter complexity of $\mathcal{SP}_{=}(\mathcal{H})$ remains unresolved.

Large-bound problems for the cases when $\Delta$ = “=” or “$\leq$” can be viewed
as problems of deciding the existence of “large” models, that is, models with
a small number of false atoms (equal to k, or less than or equal to k). The
fixed-parameter complexities of these problems are summarized in Table 3.

The problems specified by $\Delta$ = “$\leq$” and concerning the existence of models
are in P. Similarly, the problems specified by $\Delta$ = “$\leq$” and involving Horn
programs are solvable in polynomial time. Lastly, the problem […] is in
P, as well. These problems are in P even without fixing k and eliminating it
from input. All other problems in this group have higher complexity and, in
all likelihood, are fixed-parameter intractable. One of the problems, $\mathcal{M}'_{=}(\mathcal{N})$, is



Table 2. The complexities of the problem of computing small models (small-bound
problems, the cases of $\Delta$ = “=” and “$\leq$”).

Table 3. The complexities of the problems of computing large models (large-bound
problems, the cases of $\Delta$ = “=” and “$\leq$”).
W[1]-complete. Most of the remaining problems are W[2]-complete. Surprisingly,
some problems are even harder. Three problems concerning supported models are
W[3]-complete. For two problems involving stable models, $\mathcal{ST}'_{=}(\mathcal{A})$ and $\mathcal{ST}'_{\leq}(\mathcal{A})$,
we could only prove that they are W[3]-hard. For these two problems we did not
succeed in establishing any upper bound on their fixed-parameter complexities.

The study of fixed-parameter tractability of problems occurring in the area 
of nonmonotonic reasoning is a relatively new research topic. The only two other 
papers we are aware of are [Tru01] and [GSS99]. The first of these two papers
provided a direct motivation for our work here (we discussed it earlier). In the 
second one, the authors focused on parameters describing structural properties 
of programs. They showed that under some choices of the parameters decision 
problems for nonmonotonic reasoning become fixed-parameter tractable. 

Our results concerning computing stable and supported models for logic pro- 
grams are mostly negative. Parameterizing basic decision problems by constrain- 
ing the size of models of interest does not lead (in most cases) to fixed-parameter 
tractability. 

There are, however, several interesting aspects to our work. First, we identi- 
fied some problems that are W[3]-complete or W[3]-hard. Relatively few prob- 
lems from these classes were known up to now [DF97]. Second, in the con- 
text of the polynomial hierarchy, there is no distinction between the problem 
of existence of models of specified sizes of clausal propositional theories and 
similar problems concerning models, supported models and stable models of 
logic programs. All these problems are NP-complete. However, when we look 
at the complexity of these problems in a more detailed way, from the perspec- 
tive of fixed-parameter complexity, the equivalence is lost. Some problems are 
W[3]-hard, while problems concerning existence of models of 2-normalized for- 
mulas are W[2]-complete or easier. Third, our results show that in the context of 
fixed-parameter tractability, several problems involving models and supported 
models are hard even for the class of Horn programs. Finally, our work leaves 
three problems unresolved. While we obtained some bounds for the problems 
$\mathcal{SP}_{=}(\mathcal{H})$, $\mathcal{ST}'_{\leq}(\mathcal{A})$ and $\mathcal{ST}'_{=}(\mathcal{A})$, we did not succeed in establishing their precise
fixed-parameter complexities.

The rest of our paper is organized as follows. In the next section, we review 
relevant concepts in logic programming. After that, we present several useful 
fixed-parameter complexity results for problems of the existence of models for 
propositional theories of certain special types. In the last section we give proofs 
of some of our complexity results. 

2 Preliminaries 

In the paper, we consider only the propositional case. A logic program clause 
(or rule) is any expression r of the form 

$r = p \leftarrow q_1, \ldots, q_m, \mathbf{not}(s_1), \ldots, \mathbf{not}(s_n)$, (2)

where p, $q_i$ and $s_i$ are propositional atoms. We call the atom p the head of r and
we denote it by $h(r)$. Further, we call the set of atoms $\{q_1, \ldots, q_m, s_1, \ldots, s_n\}$
the body of r and we denote it by $b(r)$. We distinguish the positive body of r,
$\{q_1, \ldots, q_m\}$ ($b^+(r)$, in symbols), and the negative body of r, $\{s_1, \ldots, s_n\}$ ($b^-(r)$,
in symbols).

A logic program is a collection of clauses. For a logic program P, by $At(P)$
we denote the set of atoms that appear in P. If every clause in a logic program
P has an empty negative body, we call P a Horn program. If every clause in P 
has an empty positive body, we call P a purely negative program. 

A clause r, given by (2), has a propositional interpretation as an implication 

$pr(r) = q_1 \wedge \ldots \wedge q_m \wedge \neg s_1 \wedge \ldots \wedge \neg s_n \Rightarrow p.$

Given a logic program P, by a propositional interpretation of P we mean the 
propositional formula 

$pr(P) = \bigwedge \{pr(r) : r \in P\}.$

We say that a set of atoms M is a model of a clause (2) if M is a (propositional)
model of the clause $pr(r)$. As usual, atoms in M are interpreted as true, all
other atoms are interpreted as false. A set of atoms $M \subseteq At(P)$ is a model of a
program P if it is a model of the formula $pr(P)$. We emphasize the requirement
$M \subseteq At(P)$. In this paper, given a program P, we are interested only in the
truth values of atoms that actually occur in P.

It is well known that every Horn program P has a least model (with respect
to set inclusion). We will denote this model by $lm(P)$.

Let P be a logic program. Following [Cla78], for every atom $p \in At(P)$ we
define a propositional formula $comp(p)$ by

$comp(p) = p \Leftrightarrow \bigvee \{c(r) : r \in P,\ h(r) = p\}$,

where

$c(r) = \bigwedge \{q : q \in b^+(r)\} \wedge \bigwedge \{\neg s : s \in b^-(r)\}$.

If for an atom $p \in At(P)$ there are no rules with p in the head, we get an empty
disjunction in the definition of $comp(p)$, which we interpret as a contradiction.
We define the program completion [Cla78] of P as the propositional theory

$comp(P) = \{comp(p) : p \in At(P)\}$.

A set of atoms $M \subseteq At(P)$ is a supported model of P if it is a model of the
completion of P. It is easy to see that if p does not appear as the head of a rule
in P, p is false in every supported model of P. It is also easy to see that each
supported model of a program P is a model of P (the converse is not true in
general).
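
The definition of a supported model can be checked mechanically on a small program. The following Python sketch is an illustration under stated assumptions: rules are encoded as hypothetical triples (head, positive_body, negative_body), and the helper names are ours. It tests each equivalence $comp(p)$ directly: an atom must be true exactly when some rule with that atom in the head has a true body.

```python
from itertools import combinations


def atoms_of(program):
    """Collect At(P): every atom occurring in some rule of the program."""
    return {a for head, pos, neg in program for a in (head, *pos, *neg)}


def body_true(pos, neg, m):
    """c(r) holds in M: all positive body atoms true, all negated atoms false."""
    return all(q in m for q in pos) and all(s not in m for s in neg)


def is_model(program, m):
    return all(head in m or not body_true(pos, neg, m) for head, pos, neg in program)


def is_supported_model(program, m):
    """M satisfies comp(p) for every atom p in At(P)."""
    for p in atoms_of(program):
        has_support = any(
            head == p and body_true(pos, neg, m) for head, pos, neg in program
        )
        if (p in m) != has_support:
            return False
    return True


def subsets(universe):
    items = sorted(universe)
    return [set(c) for size in range(len(items) + 1) for c in combinations(items, size)]


# Example program:  p <- p.   q <- not(p).
program = [("p", ("p",), ()), ("q", (), ("p",))]
models = [m for m in subsets(atoms_of(program)) if is_model(program, m)]
supported = [m for m in subsets(atoms_of(program)) if is_supported_model(program, m)]
print(models, supported)
```

In this example {p, q} is a model but not a supported model, since no rule with head q has a true body once p is true.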

Given a logic program P and a set of atoms M, we define the reduct of P
with respect to M ($P^M$, in symbols) to be the logic program obtained from P
by

1. removing from P each clause r such that $M \cap b^-(r) \neq \emptyset$ (we call such clauses
blocked by M),
2. removing all negated atoms from the bodies of all the rules that remain (that
is, those rules that are not blocked by M).

The reduct $P^M$ is a Horn program. Thus, it has a least model. We say that M
is a stable model of P if $M = lm(P^M)$. Both the notion of the reduct and that
of a stable model were introduced in [GL88].

It is known that every stable model of a program P is a supported model 
of P. The converse does not hold in general. However, if a program P is purely 
negative, then stable and supported models of P coincide [Fag94]. 
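
The two-step construction of the reduct and the stability test can be sketched in Python as follows. This is a minimal illustration, assuming the same hypothetical encoding of a rule as a triple (head, positive_body, negative_body); the least model of the Horn reduct is computed by the usual bottom-up iteration.

```python
def reduct(program, m):
    """Drop rules blocked by M, then drop the negated atoms of the remaining rules."""
    return [(head, pos) for head, pos, neg in program if not (set(neg) & set(m))]


def least_model(horn_program):
    """Least model of a Horn program, computed by iterating to a fixpoint."""
    lm = set()
    changed = True
    while changed:
        changed = False
        for head, pos in horn_program:
            if head not in lm and all(q in lm for q in pos):
                lm.add(head)
                changed = True
    return lm


def is_stable_model(program, m):
    """M is stable if it equals the least model of the reduct of P with respect to M."""
    return least_model(reduct(program, m)) == set(m)


# p <- p.   q <- not(p).      {p} is supported but not stable.
program = [("p", ("p",), ()), ("q", (), ("p",))]
print(is_stable_model(program, {"q"}), is_stable_model(program, {"p"}))

# A purely negative program:  p <- not(q).   q <- not(p).
negative = [("p", (), ("q",)), ("q", (), ("p",))]
print(is_stable_model(negative, {"p"}), is_stable_model(negative, {"q"}))
```

The first program shows that the converse fails in general: {p} is a supported model whose only justification for p is p itself, and the least model of the reduct is empty.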

In our arguments we use fixed-parameter complexity results on problems 
to decide the existence of models of prescribed sizes for propositional formulas 
from some special classes. To describe these problems we introduce additional 
terminology. First, given a propositional theory $\Phi$, by $At(\Phi)$ we denote the set
of atoms occurring in $\Phi$. As in the case of logic programming, we consider as
models of a propositional theory only those sets of atoms that are subsets of
$At(\Phi)$. Next, we define the following classes of formulas:

$tN$: the class of t-normalized formulas (if t = 2, these are simply CNF formulas)
$2N_3$: the class of all 2-normalized formulas whose every clause is a disjunction
of at most three literals (clearly, $2N_3$ is a subclass of the class $2N$)
$tNM$: the class of monotone t-normalized formulas, that is, t-normalized formulas
in which there are no occurrences of the negation operator
$tNA$: the class of antimonotone t-normalized formulas, that is, t-normalized formulas
in which every atom is directly preceded by the negation operator.

Finally, we extend the notation $\mathcal{M}_\Delta(\mathcal{C})$ and $\mathcal{M}'_\Delta(\mathcal{C})$ to the case when $\mathcal{C}$ stands
for a class of propositional formulas. In this terminology, $\mathcal{M}'_{=}(3NM)$ denotes the
problem to decide whether a monotone 3-normalized formula has a model in
which exactly k atoms are false. Similarly, $\mathcal{M}_{=}(tN)$ is simply another notation for
the problem $WS(t)$ that we discussed above. The following theorem establishes
several complexity results that we will use later in the paper.

Theorem 1. (i) The problems $\mathcal{M}_{=}(2N)$ and $\mathcal{M}_{=}(2NM)$ are W[2]-complete.

(ii) The problems $\mathcal{M}_{=}(2N_3)$ and $\mathcal{M}_{=}(2NA)$ are W[1]-complete.

(iii) The problem $\mathcal{M}'_{\leq}(3N)$ is W[3]-complete.

Proof: The statements (i) and (ii) are proved in [DF97]. To prove the statement
(iii), we use the fact that the problem $\mathcal{M}_{\leq}(3N)$ is W[3]-complete [DF97]. We
reduce $\mathcal{M}_{\leq}(3N)$ to $\mathcal{M}'_{\leq}(3N)$ and conversely. Let us consider a 3-normalized
formula $\Phi = \bigwedge_{i=1}^{m} \bigvee_{j=1}^{m_i} \bigwedge_{\ell=1}^{m_{ij}} x[i,j,\ell]$, where $x[i,j,\ell]$ are literals. We observe
that $\Phi$ has a model of cardinality at most k if and only if a related formula
$\bar{\Phi} = \bigwedge_{i=1}^{m} \bigvee_{j=1}^{m_i} \bigwedge_{\ell=1}^{m_{ij}} \bar{x}[i,j,\ell]$, obtained from $\Phi$ by replacing every negative literal
$\neg x$ by a new atom $\bar{x}$ and every positive literal x by a negated atom $\neg\bar{x}$, has a
model of cardinality at least $|At(\Phi)| - k$. This construction defines a reduction
of $\mathcal{M}_{\leq}(3N)$ to $\mathcal{M}'_{\leq}(3N)$. It is easy to see that this reduction satisfies all the
requirements of the definition of fixed-parameter reducibility.

A reduction of $\mathcal{M}'_{\leq}(3N)$ to $\mathcal{M}_{\leq}(3N)$ can be constructed in a similar way.
Since the problem $\mathcal{M}_{\leq}(3N)$ is W[3]-complete, so is the problem $\mathcal{M}'_{\leq}(3N)$. □
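
The literal-flipping construction used in part (iii) can be tried out on a small formula. The Python sketch below is an illustration only: it assumes a hypothetical encoding of a 3-normalized formula as a list (conjunction) of lists (disjunctions) of lists (conjunctions) of literals (atom, is_positive), and it compares model sizes by brute force rather than implementing the reduction as a polynomial-time procedure.

```python
from itertools import combinations


def atoms_of(formula):
    return sorted({a for disj in formula for conj in disj for a, _ in conj})


def holds(formula, true_atoms):
    """Evaluate a conjunction of disjunctions of conjunctions of literals."""
    return all(
        any(all((a in true_atoms) == positive for a, positive in conj) for conj in disj)
        for disj in formula
    )


def flip(formula):
    """Rename each atom x to x_bar and swap the polarity of every literal."""
    return [
        [[(a + "_bar", not positive) for a, positive in conj] for conj in disj]
        for disj in formula
    ]


def model_sizes(formula):
    """Set of cardinalities of all models (subsets of the formula's atoms)."""
    universe = atoms_of(formula)
    return {
        size
        for size in range(len(universe) + 1)
        for cand in combinations(universe, size)
        if holds(formula, set(cand))
    }


# ((x and not y) or z) and (not x or (y and z))
phi = [
    [[("x", True), ("y", False)], [("z", True)]],
    [[("x", False)], [("y", True), ("z", True)]],
]
n = len(atoms_of(phi))
sizes_phi = model_sizes(phi)
sizes_bar = model_sizes(flip(phi))
print(sizes_phi, sizes_bar)
```

A model M of the original formula corresponds to the set of barred atoms whose originals are false in M, so a model of size s becomes a model of size n - s, which is the correspondence the proof relies on.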




Fixed-Parameter Complexity of Semantics for Logic Programs 205 

In the proof of part (iii) of Theorem 1, we observed that the reduction we 
described there satisfies all the requirements specified in Definition 1 of fixed- 
parameter reducibility. Throughout the paper we prove our complexity results 
by constructing reductions from one problem to another. In most cases, we only 
verify the condition (4) of the definition which, usually, is the only non-trivial 
part of the proof. Checking that the remaining conditions hold is straightforward 
and we leave these details out. 

3 Some Proofs 

In this section we present some typical proofs of fixed-parameter complexity 
results for problems involving existence of models, supported models and stable 
models of logic programs. Our goal is to introduce key proof techniques that we 
used when proving the results discussed in the introduction. 

Theorem 2. The problems $\mathcal{M}'_=(\mathcal{H})$ and $\mathcal{M}'_=(\mathcal{A})$ are W[2]-complete.

Proof: Both problems are clearly in W[2] (models of a logic program P are models
of the corresponding 2-normalized formula pr(P)). Since $\mathcal{H} \subseteq \mathcal{A}$, to complete
the proof it is enough to show that the problem $\mathcal{M}'_=(\mathcal{H})$ is W[2]-hard. To this
end, we will reduce the problem $\mathcal{M}_=(2NM)$ to $\mathcal{M}'_=(\mathcal{H})$.

Let $\Phi$ be a monotone 2-normalized formula and let $k \ge 0$. Let $\{x_1, \ldots, x_n\}$
be the set of atoms of $\Phi$. We define a Horn program $P_\Phi$ corresponding to $\Phi$ as
follows. We choose an atom $a$ not occurring in $\Phi$ and include in $P_\Phi$ all rules of
the form $x_i \leftarrow a$, $i = 1, 2, \ldots, n$. Next, for each clause $C = x_{i_1} \vee \ldots \vee x_{i_p}$ of $\Phi$
we include in $P_\Phi$ the rule

$$r_C = a \leftarrow x_{i_1}, \ldots, x_{i_p}.$$

We will show that $\Phi$ has a model of size k if and only if $P_\Phi$ has a model of size
$|At(P_\Phi)| - (k+1) = (n+1) - (k+1) = n - k$.

Let M be a model of $\Phi$ of size k. We define $M' = \{x_1, \ldots, x_n\} \setminus M$. The set
$M'$ has $n - k$ elements. Let us consider any clause $r_C \in P_\Phi$ of the form given
above. Since M satisfies C, there is j, $1 \le j \le p$, such that $x_{i_j} \notin M'$. Thus, $M'$
is a model of $r_C$. Since $a \notin M'$, $M'$ satisfies all clauses $x_i \leftarrow a$. Hence, $M'$ is a
model of $P_\Phi$.

Conversely, let $M'$ be a model of $P_\Phi$ of size exactly $n - k$. If $a \in M'$ then
$x_i \in M'$, for every i, $1 \le i \le n$. Thus, $|M'| = n + 1 > n - k$, a contradiction.
Consequently, we obtain that $a \notin M'$. Let $M = \{x_1, \ldots, x_n\} \setminus M'$. Since $a \notin M'$,
$|M| = k$. Moreover, M satisfies all clauses in $\Phi$. Indeed, let us assume that there
is a clause C such that no atom of C is in M. Then, all atoms of C are in $M'$.
Since $M'$ satisfies $r_C$, $a \in M'$, a contradiction. Now, the assertion follows by
Theorem 1. □
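
The construction of $P_\Phi$ is simple enough to test exhaustively on a toy formula. The sketch below is a hedged illustration, not part of the original proof: the clause set, the atom names and all function names are assumptions made for the example, and models are found by brute-force enumeration rather than by any efficient method.

```python
from itertools import combinations

def horn_program(clauses, atoms, fresh="a"):
    """Return P_Phi as (head, body) pairs; bodies are frozensets of atoms."""
    assert fresh not in atoms                          # a must not occur in Phi
    rules = [(x, frozenset([fresh])) for x in atoms]   # x_i <- a
    rules += [(fresh, frozenset(c)) for c in clauses]  # r_C = a <- x_i1, ..., x_ip
    return rules

def satisfies_program(m, rules):
    # A Horn rule fails only when its body is true and its head is false.
    return all(head in m or not body <= m for head, body in rules)

def satisfies_cnf(m, clauses):
    # A monotone clause is true when at least one of its atoms is true.
    return all(any(x in m for x in c) for c in clauses)

def model_sizes(universe, check):
    """Sizes s for which some s-element subset of universe passes check."""
    return {s for s in range(len(universe) + 1)
            if any(check(set(m)) for m in combinations(universe, s))}

atoms = ["x1", "x2", "x3"]
clauses = [{"x1", "x2"}, {"x2", "x3"}]   # hypothetical monotone 2-normalized formula
rules = horn_program(clauses, atoms)
n = len(atoms)

cnf_sizes = model_sizes(atoms, lambda m: satisfies_cnf(m, clauses))
prog_sizes = model_sizes(atoms + ["a"], lambda m: satisfies_program(m, rules))
```

On this instance the formula has models of sizes 1, 2 and 3, and the program has models of sizes 2, 1 and 0 that avoid the atom a, matching the correspondence k versus n - k used in the proof.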



Theorem 3. The problem $\mathcal{M}_=(\mathcal{H})$ is W[1]-complete.

Proof: We will first prove the hardness part. To this end, we will reduce the problem
$\mathcal{M}_=(2NA)$ to the problem $\mathcal{M}_=(\mathcal{H})$. Let $\Phi$ be an antimonotone 2-normalized
formula and let k be a non-negative integer. Let $a_0, \ldots, a_k$ be $k + 1$ different
atoms not occurring in $\Phi$. For each clause $C = \neg x_1 \vee \ldots \vee \neg x_p$ of $\Phi$ we define a
logic program rule $r_C$ by

$$r_C = a_0 \leftarrow x_1, \ldots, x_p.$$

We then define $P_\Phi$ by

$$P_\Phi = \{r_C : C \in \Phi\} \cup \{a_i \leftarrow a_j : i, j = 0, 1, \ldots, k,\ i \ne j\}.$$

Let us assume that M is a model of size k of the program $P_\Phi$. If for some i,
$0 \le i \le k$, $a_i \in M$ then $\{a_0, \ldots, a_k\} \subseteq M$ and, consequently, $|M| > k$, a
contradiction. Thus, M does not contain any of the atoms $a_i$. Since M satisfies
all rules $r_C$ and since it consists of atoms of $\Phi$ only, M is a model of $\Phi$ (indeed,
the body of each rule $r_C$ must be false so, consequently, each clause C must be
true). Similarly, one can show that if M is a model of $\Phi$ then it is a model of
$P_\Phi$. Thus, W[1]-hardness follows by Theorem 1.

To prove that the problem $\mathcal{M}_=(\mathcal{H})$ is in the class W[1], we will reduce it to
the problem $\mathcal{M}_=(2N_3)$. To this end, for every Horn program P we will describe
a 2-normalized formula $\Phi_P$, with each clause consisting of no more than three
literals, and such that P has a model of size k if and only if $\Phi_P$ has a model of
size $(k+1)2^k + k$. Moreover, we will show that $\Phi_P$ can be constructed in time
bounded by a polynomial in the size of P (with the degree not depending on k).

First, let us observe that without loss of generality we may restrict our at- 
tention to Horn programs whose rules do not contain multiple occurrences of 
the same atom in the body. Such occurrences can be eliminated in time linear 
in the size of the program. Next, let us note that under this restriction, a Horn 
program P has a model of size k if and only if the program P' , obtained from 
P by removing all clauses with bodies consisting of more than k atoms, has a 
model of size k. The program P' can be constructed in time linear in the size of 
P and k. 

Thus, we will describe the construction of the formula $\Phi_P$ only for Horn
programs P in which the body of every rule consists of no more than k atoms.
Let P be such a program. We define

$$\mathcal{B} = \{B : B \subseteq b(r), \text{ for some } r \in P\}.$$

For every set $B \in \mathcal{B}$ we introduce a new variable $u[B]$. Further, for every atom
x in P we introduce $2^k$ new atoms $x[i]$, $i = 1, \ldots, 2^k$.

We will now define several families of formulas. First, for every $x \in At(P)$
and $i = 1, \ldots, 2^k$ we define

$$D(x,i) = x \Leftrightarrow x[i] \quad (\text{or } (\neg x \vee x[i]) \wedge (x \vee \neg x[i])),$$

and, for each set $B \in \mathcal{B}$ and for each $x \in B$, we define

$$E(B,x) = x \wedge u[B \setminus \{x\}] \Rightarrow u[B] \quad (\text{or } \neg x \vee \neg u[B \setminus \{x\}] \vee u[B]).$$

Next, for each set $B \in \mathcal{B}$ and for each $x \in B$ we define

$$F(B,x) = u[B] \Rightarrow x \quad (\text{or } \neg u[B] \vee x).$$

Finally, for each rule r in P we introduce a formula

$$G(r) = u[b(r)] \Rightarrow h(r) \quad (\text{or } \neg u[b(r)] \vee h(r)).$$

We define $\Phi_P$ to be the conjunction of all these formulas (more precisely, of
their 2-normalized representations given in the parentheses) and of the formula
$u[\emptyset]$. Clearly, $\Phi_P$ is a formula from the class $2N_3$. Further, since the body of each
rule in P has at most k elements, the set $\mathcal{B}$ has no more than $|P|2^k$ elements, each
of them of size at most k ($|P|$ denotes the cardinality of P, that is, the number
of rules in P). Thus, $\Phi_P$ can be constructed in time bounded by a polynomial
in the size of P, whose degree does not depend on k.

Let us consider a model M of P such that $|M| = k$. We define

$$M' = M \cup \{x[i] : x \in M,\ i = 1, \ldots, 2^k\} \cup \{u[B] : B \subseteq M\}.$$

The set $M'$ satisfies all formulas $D(x,i)$, $x \in At(P)$, $i = 1, \ldots, 2^k$. In addition,
the formula $u[\emptyset]$ is also satisfied by $M'$ ($\emptyset \subseteq M$ and so, $u[\emptyset] \in M'$).

Let us consider a formula $E(B,x)$, for some $B \in \mathcal{B}$ and $x \in B$. Let us assume
that $x \wedge u[B \setminus \{x\}]$ is true in $M'$. Then, $x \in M'$ and, since $x \in At(P)$, $x \in M$.
Moreover, since $u[B \setminus \{x\}] \in M'$, $B \setminus \{x\} \subseteq M$. It follows that $B \subseteq M$ and,
consequently, that $u[B] \in M'$. Thus, $M'$ satisfies all "E-formulas" in $\Phi_P$.

Next, let us consider a formula $F(B,x)$, where $B \in \mathcal{B}$ and $x \in B$, and let
us assume that $M'$ satisfies $u[B]$. It follows that $B \subseteq M$. Consequently, $x \in M$.
Since $M \subseteq M'$, $M'$ satisfies x and so, $M'$ satisfies $F(B,x)$.

Lastly, let us look at a formula $G(r)$, where $r \in P$. Let us assume that
$u[b(r)] \in M'$. Then, $b(r) \subseteq M$. Since r is a Horn clause and since M is a model
of P, it follows that $h(r) \in M$. Consequently, $h(r) \in M'$. Thus, $M'$ is a model
of $G(r)$.

We proved that $M'$ is a model of $\Phi_P$. Moreover, it is easy to see that
$|M'| = k + k2^k + 2^k = (k+1)2^k + k$.

Conversely, let us assume that $M'$ is a model of $\Phi_P$ and that $|M'| = (k+1)2^k + k$.
We set $M = M' \cap At(P)$. First, we will show that M is a model of P.
Let us consider an arbitrary clause $r \in P$, say

$$r = h \leftarrow b_1, \ldots, b_p,$$

where h and $b_i$, $1 \le i \le p$, are atoms. Let us assume that $\{b_1, \ldots, b_p\} \subseteq M$. We
need to show that $h \in M$.

Since $\{b_1, \ldots, b_p\} = b(r)$, the set $\{b_1, \ldots, b_p\}$ and all its subsets belong to $\mathcal{B}$.
Thus, $\Phi_P$ contains formulas

$$E(\{b_1, \ldots, b_i\}, b_i) = b_i \wedge u[\{b_1, \ldots, b_{i-1}\}] \Rightarrow u[\{b_1, \ldots, b_{i-1}, b_i\}],$$

where $i = 1, \ldots, p$. All these formulas are satisfied by $M'$. We also have $u[\emptyset] \in \Phi_P$.
Consequently, $u[\emptyset]$ is satisfied by $M'$, as well. Since all atoms $b_i$, $1 \le i \le p$, are
also satisfied by $M'$ (since $M \subseteq M'$), it follows that $u[\{b_1, \ldots, b_p\}]$ is satisfied
by $M'$.

The formula $G(r) = u[\{b_1, \ldots, b_p\}] \Rightarrow h$ belongs to $\Phi_P$. Thus, it is satisfied
by $M'$. It follows that $h \in M'$. Since $h \in At(P)$, $h \in M$. Thus, M is a model of
r and, consequently, of the program P.

To complete the proof we have to show that $|M| = k$. Since $M'$ is a model of
$\Phi_P$, for every $x \in M$, $M'$ contains all atoms $x[i]$, $1 \le i \le 2^k$. Hence, if $|M| > k$
then $|M'| \ge |M| + |M| \times 2^k \ge (k+1)(1 + 2^k) > (k+1)2^k + k$, a contradiction.

So, we will assume that $|M| < k$. Let us consider an atom $u[B]$, where $B \in \mathcal{B}$,
such that $u[B] \in M'$. For every $x \in B$, $\Phi_P$ contains the rule $F(B,x)$. The set $M'$
is a model of $F(B,x)$. Thus, $x \in M'$ and, since $x \in At(P)$, we have that $x \in M$.
It follows that $B \subseteq M$. It is now easy to see that the number of atoms of the form
$u[B]$ that are true in $M'$ is smaller than $2^k$. Thus, $|M'| < |M| + |M| \times 2^k + 2^k \le
(k-1)(1 + 2^k) + 2^k < (k+1)2^k + k$, again a contradiction. Consequently, $|M| = k$.

It follows that the problem $\mathcal{M}_=(\mathcal{H})$ can be reduced to the problem $\mathcal{M}_=(2N_3)$.
Thus, by Theorem 1, the problem $\mathcal{M}_=(\mathcal{H})$ is in the class W[1]. This completes
our argument. □

Theorem 4. The problem $\mathcal{SP}_=(\mathcal{A})$ is in W[2].

Proof: We will show a reduction of $\mathcal{SP}_=(\mathcal{A})$ to $\mathcal{M}_=(2N)$, which is in W[2] by
Theorem 1. Let P be a logic program with atoms $x_1, \ldots, x_n$. We can identify
supported models of P with models of its completion comp(P). The completion
is of the form $comp(P) = \Phi_1 \wedge \ldots \wedge \Phi_n$, where

$$\Phi_i = x_i \Leftrightarrow \bigvee_{j=1}^{m_i} \bigwedge_{\ell=1}^{m_{ij}} x[i,j,\ell],$$

$i = 1, \ldots, n$, and $x[i,j,\ell]$ are literals. It can be constructed in linear time in the
size of the program P.

We will use comp(P) to define a formula $\Phi_P$. The atoms of $\Phi_P$ are $x_1, \ldots, x_n$
and $u[i,j]$, $i = 1, \ldots, n$, $j = 1, \ldots, m_i$. For $i = 1, \ldots, n$, let

$$G_i = x_i \Rightarrow \bigvee_{j=1}^{m_i} u[i,j] \quad (\text{or } \neg x_i \vee \bigvee_{j=1}^{m_i} u[i,j]),$$

$$G'_i = \bigvee_{j=1}^{m_i} u[i,j] \Rightarrow x_i \quad (\text{or } \bigwedge_{j=1}^{m_i} (x_i \vee \neg u[i,j])),$$

$$H_i = \bigwedge_{j=1}^{m_i - 1} \bigwedge_{j'=j+1}^{m_i} (\neg u[i,j] \vee \neg u[i,j']), \quad \text{for every } i \text{ such that } m_i \ge 2,$$

$$I_i = \bigwedge_{j=1}^{m_i} \Big(u[i,j] \Rightarrow \bigwedge_{\ell=1}^{m_{ij}} x[i,j,\ell]\Big) \quad (\text{or } \bigwedge_{j=1}^{m_i} \bigwedge_{\ell=1}^{m_{ij}} (\neg u[i,j] \vee x[i,j,\ell])),$$

$$J_i = \bigvee_{j=1}^{m_i} \bigwedge_{\ell=1}^{m_{ij}} x[i,j,\ell] \Rightarrow x_i \quad (\text{or } \bigwedge_{j=1}^{m_i} \Big(x_i \vee \bigvee_{\ell=1}^{m_{ij}} \neg x[i,j,\ell]\Big)).$$

The formula $\Phi_P$ is a conjunction of the formulas written above (of the formulas
given in the parentheses, to be precise). Clearly, $\Phi_P$ is a 2-normalized
formula. We will show that comp(P) has a model of size k (or equivalently, that
P has a supported model of size k) if and only if $\Phi_P$ has a model of size 2k.

Let $M = \{x_{p_1}, \ldots, x_{p_k}\}$ be a model of comp(P). Then, for each $i = p_1, \ldots, p_k$,
there is j, $1 \le j \le m_i$, such that M is a model of $\bigwedge_{\ell=1}^{m_{ij}} x[i,j,\ell]$ (this is because
M is a model of every formula $\Phi_i$). We denote one such j (an arbitrary one) by
$j_i$. We claim that

$$M' = M \cup \{u[i,j_i] : i = p_1, \ldots, p_k\}$$

is a model of $\Phi_P$. Clearly, $G_i$ is true in $M'$ for every i, $1 \le i \le n$. If $x_i \notin M$ then
$u[i,j] \notin M'$ for all $j = 1, \ldots, m_i$. Thus, $G'_i$ is satisfied by $M'$. Since for each i,
$1 \le i \le n$, there is at most one j such that $u[i,j] \in M'$, it follows that every
formula $H_i$ is true in $M'$. By the definition of $j_i$, if $u[i,j] \in M'$ then $j = j_i$ and
$M'$ is a model of $\bigwedge_{\ell=1}^{m_{ij}} x[i,j,\ell]$. Hence, $I_i$ is satisfied by $M'$. Finally, all formulas
$J_i$, $1 \le i \le n$, are clearly true in $M'$. Thus, $M'$ is a model of $\Phi_P$ of size 2k.

Conversely, let $M'$ be a model of $\Phi_P$ such that $|M'| = 2k$. Let us assume that
$M'$ contains exactly s atoms $u[i,j]$. The clauses $H_i$ ensure that for each i, $M'$
contains at most one atom $u[i,j]$. Therefore, the set $M' \cap \{u[i,j] : i = 1, \ldots, n,\ j = 1, \ldots, m_i\}$
is of the form $\{u[p_1, j_{p_1}], \ldots, u[p_s, j_{p_s}]\}$ where $p_1 < \ldots < p_s$.

Since the conjunction of $G_i$ and $G'_i$ is equivalent to $x_i \Leftrightarrow \bigvee_{j=1}^{m_i} u[i,j]$, it
follows that exactly s atoms $x_i$ belong to $M'$. Thus, $|M'| = 2s = 2k$ and $s = k$.
It is now easy to see that $M'$ is of the form $\{x_{p_1}, \ldots, x_{p_k}, u[p_1, j_{p_1}], \ldots, u[p_k, j_{p_k}]\}$.
We will now prove that for every i, $1 \le i \le n$, the implication

$$x_i \Rightarrow \bigvee_{j=1}^{m_i} \bigwedge_{\ell=1}^{m_{ij}} x[i,j,\ell]$$

is true in $M'$. To this end, let us assume that $x_i$ is true in $M'$. Then, there is
j, $1 \le j \le m_i$, such that $u[i,j] \in M'$ (in fact, $i = p_t$ and $j = j_{p_t}$, for some t,
$1 \le t \le k$). Since the formula $I_i$ is true in $M'$, the formula $\bigwedge_{\ell=1}^{m_{ij}} x[i,j,\ell]$ is true
in $M'$. Thus, the formula $\bigvee_{j=1}^{m_i} \bigwedge_{\ell=1}^{m_{ij}} x[i,j,\ell]$ is true in $M'$, too.

Since for every i, $1 \le i \le n$, the formula $J_i$ is true in $M'$, it follows that
all formulas $\Phi_i$ are true in $M'$. Since the only atoms of $M'$ that appear in the
formulas $\Phi_i$ are the atoms $x_{p_1}, \ldots, x_{p_k}$, it follows that $M = \{x_{p_1}, \ldots, x_{p_k}\}$ is a
model of $comp(P) = \Phi_1 \wedge \ldots \wedge \Phi_n$.

Thus, the problem $\mathcal{SP}_=(\mathcal{A})$ can be reduced to the problem $\mathcal{M}_=(2N)$, which
completes the proof. □

For the problem $\mathcal{SP}_=(\mathcal{A})$ we also established the hardness result: we proved
that it is W[2]-hard (we omit the proof due to space restrictions). Thus, we found
the exact location of this problem in the W-hierarchy. For the problem $\mathcal{ST}'_\le(\mathcal{A})$,
that we are about to consider now, we only succeeded in establishing the lower
bound on its complexity. We proved it to be W[3]-hard. We did not succeed in
obtaining any non-trivial upper estimate on its complexity.

Theorem 5. The problem $\mathcal{ST}'_\le(\mathcal{A})$ is W[3]-hard.

Proof: We will reduce the problem $\mathcal{M}'_\le(3N)$ to the problem $\mathcal{ST}'_\le(\mathcal{A})$. Let

$$\Phi = \bigwedge_{i=1}^{m} \bigvee_{j=1}^{m_i} \bigwedge_{\ell=1}^{m_{ij}} x[i,j,\ell]$$

be a 3-normalized formula, where $x[i,j,\ell]$ are literals. Let $u[1], \ldots, u[m], v[1], \ldots,
v[2k+1]$ be new atoms not occurring in $\Phi$. For each atom $x \in At(\Phi)$ we introduce
new atoms $x[s]$, $s = 1, \ldots, k$.

Let $P_\Phi$ be a logic program with the following rules:

$A(x,y,s) = x[s] \leftarrow not(y[s])$, $x, y \in At(\Phi)$, $x \ne y$, $s = 1, \ldots, k$,

$B(x) = x \leftarrow x[1], x[2], \ldots, x[k]$, $x \in At(\Phi)$,

$C(i,j) = u[i] \leftarrow x'[i,j,1], x'[i,j,2], \ldots, x'[i,j,m_{ij}]$, $i = 1, \ldots, m$, $j = 1, \ldots, m_i$,
where

$$x'[i,j,\ell] = \begin{cases} x & \text{if } x[i,j,\ell] = x \\ not(x) & \text{if } x[i,j,\ell] = \neg x, \end{cases}$$

$D(q) = v[q] \leftarrow u[1], u[2], \ldots, u[m]$, $q = 1, \ldots, 2k+1$.

Clearly, $|At(P_\Phi)| = nk + n + m + 2k + 1$, where $n = |At(\Phi)|$. We will show
that $\Phi$ has a model of cardinality at least $n - k$ if and only if $P_\Phi$ has a stable
model of cardinality at least $|At(P_\Phi)| - 2k = n(k+1) + m + 1$.

Let $M = At(\Phi) \setminus \{x_1, \ldots, x_k\}$ be a model of $\Phi$, where $x_1, \ldots, x_k$ are some
atoms from $At(\Phi)$ that are not necessarily distinct. We claim that $M' = At(P_\Phi) \setminus
\{x_1, \ldots, x_k, x_1[1], \ldots, x_k[k]\}$ is a stable model of $P_\Phi$.

Let us notice that a rule $A(x,y,s)$ is not blocked by $M'$ if and only if $y = x_s$.
Hence, the program $P_\Phi^{M'}$ consists of the rules:

$x[1] \leftarrow$ , for $x \ne x_1$,

$x[2] \leftarrow$ , for $x \ne x_2$,

...

$x[k] \leftarrow$ , for $x \ne x_k$,

$x \leftarrow x[1], x[2], \ldots, x[k]$, $x \in At(\Phi)$,

$v[q] \leftarrow u[1], u[2], \ldots, u[m]$, $q = 1, \ldots, 2k+1$,

and of some of the rules with heads $u[i]$. Let us suppose that every rule of $P_\Phi$
with head $u[i]$ contains either a negated atom $x \in M$ or a non-negated atom
$x \notin M$. Then, for every $j = 1, \ldots, m_i$ there exists $\ell$, $1 \le \ell \le m_{ij}$, such that
either $x[i,j,\ell] = \neg x$ and $x \in M$, or $x[i,j,\ell] = x$ and $x \notin M$. Thus, M is not
a model of the formula $\bigvee_{j=1}^{m_i} \bigwedge_{\ell=1}^{m_{ij}} x[i,j,\ell]$ and, consequently, M is not a model
of $\Phi$, a contradiction. Hence, for every $i = 1, \ldots, m$, there is a rule with head
$u[i]$ containing neither a negated atom $x \in M$ nor a non-negated atom $x \notin M$.
These rules also contribute to the reduct $P_\Phi^{M'}$.

All atoms $x[s] \ne x_1[1], x_2[2], \ldots, x_k[k]$ are facts in $P_\Phi^{M'}$. Thus, they belong
to $lm(P_\Phi^{M'})$. Conversely, if $x[s] \in lm(P_\Phi^{M'})$ then $x[s] \ne x_1[1], x_2[2], \ldots, x_k[k]$.
Moreover, it is evident by rules $B(x)$ that $x \in lm(P_\Phi^{M'})$ if and only if $x \ne
x_1, x_2, \ldots, x_k$. Hence, by the observations in the previous paragraph, $u[i] \in
lm(P_\Phi^{M'})$, for each $i = 1, \ldots, m$. Finally, $v[q] \in lm(P_\Phi^{M'})$, $q = 1, \ldots, 2k+1$,
because the rules $D(q)$ belong to the reduct $P_\Phi^{M'}$. Hence, $M' = lm(P_\Phi^{M'})$ so $M'$
is a stable model of $P_\Phi$ and its cardinality is at least $n(k+1) + m + 1$.

Conversely, let $M'$ be a stable model of $P_\Phi$ of size at least $|At(P_\Phi)| - 2k$.
Clearly all atoms $v[q]$, $q = 1, \ldots, 2k+1$, must be members of $M'$ and, consequently,
$u[i] \in M'$, for $i = 1, \ldots, m$. Hence, for each $i = 1, \ldots, m$, there is a rule
in $P_\Phi$

$$u[i] \leftarrow x'[i,j,1], x'[i,j,2], \ldots, x'[i,j,m_{ij}]$$

such that $x \in M'$ if $x'[i,j,\ell] = x$, and $x \notin M'$ if $x'[i,j,\ell] = not(x)$.
Thus, $M'$ is a model of the formula $\bigvee_{j=1}^{m_i} \bigwedge_{\ell=1}^{m_{ij}} x[i,j,\ell]$, for each $i = 1, \ldots, m$.
Therefore $M = M' \cap At(\Phi)$ is a model of $\Phi$.

It is a routine task to check that rules $A(x,y,s)$ and $B(x)$ imply that all
stable models of $P_\Phi$ are of the form

$$At(P_\Phi) \setminus \{x_1, x_2, \ldots, x_k, x_1[1], x_2[2], \ldots, x_k[k]\}$$

($x_1, x_2, \ldots, x_k$ are not necessarily distinct). Hence, $|M| = |M' \cap At(\Phi)| \ge n - k$.

We have reduced the problem $\mathcal{M}'_\le(3N)$ to the problem $\mathcal{ST}'_\le(\mathcal{A})$. Thus, the
assertion follows by Theorem 1. □
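
The proof relies on the reduct of a program with respect to a set of atoms and on the least model of the resulting Horn program. The Python sketch below illustrates that check by brute force on the rules $A(x,y,s)$ and $B(x)$ only, for a hypothetical instance with three atoms and k = 1. The rule encoding and all function names are assumptions made for this example, and the rules $C(i,j)$ and $D(q)$ are deliberately left out.

```python
from itertools import combinations

def reduct(rules, m):
    """Drop rules blocked by m, then drop the negative bodies of the rest.

    Each rule is a triple (head, positive_body, negative_body)."""
    return [(h, pos) for h, pos, neg in rules if not (set(neg) & m)]

def least_model(horn_rules):
    lm, changed = set(), True
    while changed:
        changed = False
        for h, pos in horn_rules:
            if h not in lm and set(pos) <= lm:
                lm.add(h)
                changed = True
    return lm

def is_stable(rules, m):
    # m is stable when it equals the least model of the reduct w.r.t. m
    return least_model(reduct(rules, m)) == m

atoms = ["x", "y", "z"]   # hypothetical At(Phi)
k = 1
rules = []
for s in range(1, k + 1):
    for a in atoms:
        for b in atoms:
            if a != b:
                rules.append((f"{a}[{s}]", [], [f"{b}[{s}]"]))      # A(a, b, s)
for a in atoms:
    rules.append((a, [f"{a}[{s}]" for s in range(1, k + 1)], []))   # B(a)

universe = sorted({h for h, _, _ in rules})
stable = [set(m) for r in range(len(universe) + 1)
          for m in combinations(universe, r) if is_stable(rules, set(m))]
```

On this instance every stable model omits exactly one atom of the form a[1] together with the corresponding atom a, which is the shape of stable models used at the end of the proof.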
Abstract. In this paper, we propose an extension of the well-founded
and stable model semantics for logic programs with aggregates. Our
approach uses Approximation Theory, a fixpoint theory of stable and
well-founded fixpoints of non-monotone operators in a complete lattice.
We define the syntax of logic programs with aggregates and define the
immediate consequence operator of such programs. We investigate the
well-founded and stable semantics generated by Approximation Theory.
We show that our approach extends logic programs with stratified aggregation
and that it correctly deals with well-known benchmark problems
such as the shortest path program and the company control problem.



1 Introduction 

Programs with aggregates have been investigated in the context of database 
applications [1,14,17,20]. One important approach to programs with aggregates 
is Monotone aggregation [17,20]. This approach concentrates on programs where 
the 2-valued immediate consequence operator is monotone with respect to some 
given lattice order. In such cases, one can take the least fixpoint as the intended 
model of the program. Another approach is by using stratification. Aggregate 
stratified programs [1,14] can be split up in a sequence of different strata such that 
each aggregate expression only refers to predicates defined in previous strata. A 
disadvantage of both approaches is that they impose serious syntactic restrictions 
on the programs. Therefore for many problems, one has to carefully tune the 
program in such a way that it satisfies the syntactic restrictions. For this reason, 
a number of attempts have been made for extending the well-founded semantics 
[21] and the stable semantics [10] for unstratified programs with aggregates. 

Well-founded semantics is a natural semantics for deductive databases. As
shown in [2,4], it captures a general principle of non-monotone inductive definition.
For this reason, [3] used the well-founded semantics as the semantic principle
of ID-logic, a logic for knowledge representation which integrates classical
logic assertions and inductive definitions. ID-logic can be seen as an extension
of Abductive Logic Programming [11] and provides an epistemological foundation
for it. The stable semantics has also been shown to be important for knowledge
representation, in particular for nonmonotonic reasoning. It has recently become
the foundation of an emerging paradigm of stable logic programming [13],
where problems are solved by computing stable models of a logic program.
Aggregates are attracting increasing interest in the context of these
extensions. Experiments in solving combinatorial search problems showed that
many of these problems cannot be adequately expressed without aggregation.
Recently, smodels [15], one of the prominent systems for stable logic programming,
has been extended successfully with a limited form of aggregate constraints
[16]. [22] describes an extension of an abductive system with aggregates and
some experiments with this system in the context of scheduling and puzzles.

As will be shown in section 5, current extensions of well-founded semantics 
[12,18,19] and stable model semantics [12,16] of programs with aggregates are 
still weak in some ways. In this paper, we propose a stronger extension of the 
well-founded and stable model semantics for logic programs with aggregates. Our 
approach uses Approximation Theory developed in [5]. With each nonmonotone
operator, this theory associates a Kripke-Kleene, a Well-founded and a set of 
stable fixpoints. In the case of the immediate consequence operator of a logic 
program, these fixpoints identify exactly the models in the corresponding types 
of Logic Programming semantics. We define the syntax of logic programs with 
aggregates and define an immediate consequence operator of such programs. We 
investigate the well-founded and stable semantics generated by Approximation 
Theory, and show this approach extends logic programs with stratified aggrega- 
tion and that it correctly deals with recursive benchmark problems such as the 
shortest path program and the company control problem. An important prop- 
erty is that if the 2-valued immediate consequence operator is monotone then its 
ultimate well-founded model coincides with the least fixpoint of this operator. 

2 Aggregate Logic Programs: Syntax and $T_P$

A sorted signature $\Sigma$ is a set of sort symbols and sorted function, predicate and
variable symbols. The arity of a function symbol is a tuple $(s_1, .., s_n) : s$ where
$s_1, .., s_n, s$ are sort symbols; the arity of a predicate symbol is a tuple $(s_1, .., s_n)$.
We introduce a set $Agg \subseteq \Sigma$ of sorted second order predicate symbols. These
will be called the aggregate symbols. The arity of a symbol F of Agg is specified
as a tuple $(\tau_1, .., \tau_n)$ where $\tau_i$ is either a tuple $(s_1, .., s_m)$ specifying the sort
of a relation argument, a tuple $(s_1, .., s_m) : s$ specifying the sort of a function
argument, or a first order sort symbol s. Sort symbols nat, int and real denote
the natural, integer and real numbers, respectively.

Example 2.1. The following aggregate symbols are used below:

– $Card_s$: has arity ((s), nat) and denotes the cardinality relation of sets of sort s.

– $Min_s$, $Max_s$: have arity ((s), s) and denote the minimality, respectively
maximality relation of sets of the sort s representing a partial order.

– $Sum_{(s_1,..,s_n)}$: has arity $((s_1, .., s_n), (s_1, .., s_n) : nat, nat)$ and denotes the sum
function that sums a function over a set. I.e., for a relation R, function
F and integer n, Sum(R, F, n) holds iff $n = \sum_{(x_1,..,x_n) \in R} F(x_1, .., x_n)$ and
R is finite.
All these aggregate symbols are indexed by the sorts over which they range.
In particular, Agg may contain many cardinality, minimality, maximality or 
summation aggregates, each ranging over other sorts. However, in the sequel, we 
will drop the annotations with sorts and will write Card, Min, Max, Sum, ... 

A signature $\Sigma$ is interpreted over a $\Sigma$-structure I; this is an assignment of:

– a set $D_s$ to each sort symbol s,

– a function $f_I : D_{s_1} \times .. \times D_{s_n} \rightarrow D_s$ to each function symbol f with arity
$(s_1, .., s_n) : s$,

– a relation $p_I \subseteq D_{s_1} \times .. \times D_{s_n}$ to each predicate symbol p with arity $(s_1, .., s_n)$,

– an appropriately typed second order relation $F_I$ to each aggregate symbol
$F \in Agg$.

Next we define the syntax of aggregate formulas and aggregate programs
based on $\Sigma$ as a restricted subset of second order formulas. We define the notions
of lambda expression and of set expression, aggregate atom and aggregate formula
by simultaneous induction:

– A lambda expression is of the form $\lambda(X_1, .., X_n)t$ where $X_1, .., X_n$ are different
variables and t is a first order term. Variables appearing in t and not
among $X_1, .., X_n$ are called the free variables of the lambda expression.

– A set expression is of the form $\{(X_1, .., X_n) \mid \varphi\}$ where again $X_1, .., X_n$ are
different variables and $\varphi$ is an aggregate formula. Variables appearing in $\varphi$ and
not among $X_1, .., X_n$ are called the free variables of the set expression.

– An aggregate atom is a well-typed second order atom where each function
argument contains a lambda expression, each relation argument contains a
set expression and each other argument contains a term.

– An aggregate formula is either a (first order) atom, an aggregate atom or
a composed formula obtained by applying the normal composition rules for
the logical connectives and quantifiers.

Variables of aggregate atoms are first order, hence quantification is never over
second order variables. Also, a set expression which is part of an aggregate atom
in turn contains an aggregate formula, hence nested aggregation is allowed.

Example 2.2. The formula

$$\forall T.\ Sum(\{(U,C) \mid bucket(U) \wedge \neg leaking(U) \wedge capacity(U,C)\},\ \lambda(U,C)C,\ T) \rightarrow T \ge 1000$$

expresses that the sum of the capacity of non-leaking buckets is at least 1000. The
first argument denotes the set of pairs of a non-leaking bucket and its capacity
while the second argument represents the projection function. The argument U
(the bucket) in the set and lambda expression is necessary as a capacity has
to be counted as many times as there are non-leaking buckets with it. Using a
function capac mapping buckets to their capacity, an alternative formulation is:

$$\forall T.\ Sum(\{(U) \mid bucket(U) \wedge \neg leaking(U)\},\ \lambda(U)capac(U),\ T) \rightarrow T \ge 1000$$

Its meaning is: $\forall T.\,(T = \sum_{U \in \{U \mid bucket(U) \wedge \neg leaking(U)\}} capac(U)) \rightarrow T \ge 1000$

Aggregate expressions have been denoted in different ways. The most common
(e.g. [12]) is of the form $group\text{-}by(p(\bar{X},\bar{Y}), [\bar{X}], C = F(t[\bar{X},\bar{Y}]))$ where $\bar{X}$
are the grouping variables and C is the result of computing the aggregate function
F. It is equivalent in our notation to: $F(\{\bar{Y} \mid p(\bar{X},\bar{Y})\}, \lambda(\bar{Y})t[\bar{X},\bar{Y}], C)$. For
example, the atom $group\text{-}by(d(A,B,M), [A], N = Sum(M))$ is written in the
syntax defined here as $Sum(\{(B,M) \mid d(A,B,M)\}, \lambda(B,M)M, N)$ and denotes
that $N = \sum_{(B,M) \in \{(B,M) \mid d(A,B,M)\}} M$.
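
The reading of a grouped sum as a set expression plus a lambda expression can be mimicked directly in Python. The sketch below is an illustration under stated assumptions: the facts for `d` are hypothetical, and `aggregate_sum` and `grouped_sum` are names introduced only for this example.

```python
def aggregate_sum(relation, func):
    """Return the n for which Sum(relation, func, n) holds (relation is finite)."""
    return sum(func(*row) for row in relation)

# Hypothetical facts d(A, B, M)
d = {("a", "b1", 3), ("a", "b2", 3), ("c", "b1", 5)}

def grouped_sum(a):
    set_expr = {(b, m) for (a2, b, m) in d if a2 == a}   # {(B, M) | d(A, B, M)}
    return aggregate_sum(set_expr, lambda b, m: m)       # lambda(B, M) M

total_a = grouped_sum("a")
total_c = grouped_sum("c")
# Dropping B from the tuples collapses equal values of M into one element.
collapsed_a = sum({m for (a2, b, m) in d if a2 == "a"})
```

Keeping B in the tuples is what lets the value 3 be counted once per matching fact, which is the same point made for the bucket argument U in Example 2.2.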

Next we define aggregate programs. Let $\Sigma$ be a sorted signature of symbols
and $\Sigma' = \Sigma \cup \Sigma_d$ an extension of $\Sigma$ with a set $\Sigma_d$ of sorted predicate symbols,
called the defined predicate symbols. An aggregate program based on $\Sigma'$ is a
set of rules of the form $p(t_1, .., t_n) \leftarrow B$ where p is a defined predicate, $t_1, .., t_n$
are appropriately typed first order terms and B is an aggregate formula based
on $\Sigma'$. The resulting syntax extends the standard logic programming syntax by
allowing aggregate expressions and general formulas in the body of rules.



2.1 Examples 



Example 2.3 (Shortest path). Given are two sorts edge and weight and an interpreted
predicate symbol edge of arity (edge, edge, weight) representing a directed
weighted graph. The defined predicate sp has the same sort; sp(X, Y, W) means
that the shortest path from X to Y has total weight W. It is defined as:

$$sp(X,Y,W) \leftarrow Min(\{(W) \mid edge(X,Y,W) \vee \exists Z, W_1, W_2.(sp(X,Z,W_1) \wedge edge(Z,Y,W_2) \wedge W = W_1 + W_2)\},\ W)$$



Example 2.4 (Company control). A sort comp represents companies and a sort
shares interpreted over the real interval [0..1] represents shares. The interpreted
predicate ownsstock has arity (comp, comp, shares) and ownsstock(X, Y, S)
means that a company X owns a fraction S of the stock of a company Y. The
defined predicate Controls has arity (comp, comp); Controls(X, Y) denotes that
company X has a controlling interest in a company Y. This is the case if X
owns (by its own shares in Y augmented with the shares of other companies Z
controlled by X) more than 50% of the stock of Y. It is defined as:

$$Controls(X,Y) \leftarrow Sum(\{(Z,S) \mid ownsstock(X,Y,S) \wedge Z = X \ \vee\ Controls(X,Z) \wedge ownsstock(Z,Y,S)\},\ \lambda(Z,S)S,\ T) \wedge T > 0.5$$
2.2 Immediate Consequence Operator

First, we give a standard definition of truth assignment of aggregate formulas.
Given is a signature $\Sigma$ and a $\Sigma$-structure I. A variable assignment $\sigma$ is a (well-typed)
mapping from the variables of $\Sigma$ to domain elements. For a given variable
assignment $\sigma$, a tuple of variables $\bar{X} = (X_1, .., X_n)$ and a tuple of domain elements
$\bar{x} = (x_1, .., x_n)$, we denote by $\sigma[\bar{X}/\bar{x}]$ the variable assignment $\sigma'$ such that
$\sigma'(X) = \sigma(X)$ for each variable X not in $\bar{X}$, and $\sigma'(X_i) = x_i$. We define satisfiability
$\models$ of a formula in a structure and the evaluation operator $eval_{I,\sigma}$ which
maps first order terms to domain elements, lambda expressions to functions and
set expressions to relations. The definition is by simultaneous induction:

– $eval_{I,\sigma}(X) = \sigma(X)$

– $eval_{I,\sigma}(f(t_1, .., t_n)) = f_I(eval_{I,\sigma}(t_1), .., eval_{I,\sigma}(t_n))$

– $eval_{I,\sigma}(\lambda(\bar{X})t) = \{(\bar{x}, y) \mid (\bar{x}, y) \in D_{s_1} \times .. \times D_{s_n} \times D_s \text{ and } y = eval_{I,\sigma[\bar{X}/\bar{x}]}(t)\}$
where $(s_1, .., s_n) : s$ is the arity of the lambda expression.

– $eval_{I,\sigma}(\{\bar{X} \mid \varphi\}) = \{\bar{x} \mid \bar{x} \in D_{s_1} \times .. \times D_{s_n} \text{ and } I, \sigma[\bar{X}/\bar{x}] \models \varphi\}$ where $(s_1, .., s_n)$
is the arity of the set expression.

– for each first order or aggregate atom $F(t_1, .., t_n)$, $I, \sigma \models F(t_1, .., t_n)$ iff
$(eval_{I,\sigma}(t_1), .., eval_{I,\sigma}(t_n)) \in F_I$

– the standard truth recursion for $\wedge, \vee, \rightarrow, \ldots$

Given a signature $\Sigma' = \Sigma \cup \Sigma_d$ (symbols of $\Sigma$ are called the interpreted
symbols, those of $\Sigma_d$ the defined predicate symbols), we assume the existence
of a $\Sigma$-structure $I_0$. It may interpret some of the sorts from $\Sigma$ over the integers
or reals. If $\Sigma$ contains the standard arithmetic function and predicate symbols
such as $+$, $-$, $<$ they are interpreted in the standard way. Other sorts represent
user-defined sorts and are interpreted appropriately. Aggregate symbols are also
part of the signature $\Sigma$ and are interpreted by some fixed domain independent
relation (e.g. the minimum relation over a poset, the cardinality of a set, sum,
product, etc.). An analogy with database terminology can be drawn here: defined
predicates correspond to intensional database predicates (IDB) while interpreted
predicates correspond to extensional database predicates (EDB).

Let P be a $\Sigma'$-aggregate program and $\mathcal{I}_P$ be the set of all $\Sigma'$-structures
extending $I_0$ and interpreting the defined symbols. With these definitions, the
immediate consequence operator of P can be defined in a straightforward way.

Definition 2.1 (2-valued $T_P$ operator for aggregate programs). We define
the immediate consequence operator $T_P : \mathcal{I}_P \rightarrow \mathcal{I}_P$ of P as the operator
mapping an interpretation I to $T_P(I) = I'$ such that for each defined symbol p,
$p_{I'} = \{(x_1, .., x_n) \mid$ there is a rule $p(t_1, .., t_n) \leftarrow B$ in P and a variable assignment
$\sigma$ such that $eval_{I,\sigma}(t_1, .., t_n) = (x_1, .., x_n)$ and $I, \sigma \models B\}$.

Typically, the intended interpretation of a logic program P is given by one or
some subset of fixpoints of the $T_P$ operator. The semantics defined in the next
sections is a fixpoint semantics. For each pair $I_1, I_2 \in \mathcal{I}_P$, define $I_1 \sqsubseteq I_2$ iff for
each defined symbol p, $p_{I_1} \subseteq p_{I_2}$. A well-known phenomenon is that aggregates
lead to $T_P$ operators that are nonmonotone with respect to this order, even when
the program P doesn't contain a negation symbol.
Example 2.5. Consider the program: $p(a) \leftarrow Card(\{X \mid p(X)\}, N) \wedge N < 2$.

The $T_P$ of this program is non-monotone as it maps any interpretation of p
with at least two elements to an interpretation assigning the empty set to p. The
unique fixpoint of $T_P$ interprets p by {a}. This seems a natural solution.
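
The behaviour described in Example 2.5 can be reproduced with a few lines of Python. This is only a sketch under assumptions that are not in the text: the domain is taken to be the three constants a, b and c, and an interpretation is represented by the extension of p alone.

```python
from itertools import combinations

DOMAIN = ["a", "b", "c"]   # assumed finite domain for the sort of p

def t_p(p_ext):
    """T_P for the program  p(a) <- Card({X | p(X)}, N) and N < 2."""
    n = len(p_ext)                          # Card({X | p(X)}, N)
    return frozenset({"a"}) if n < 2 else frozenset()

interpretations = [frozenset(c) for r in range(len(DOMAIN) + 1)
                   for c in combinations(DOMAIN, r)]
fixpoints = [i for i in interpretations if t_p(i) == i]

# Witness of non-monotonicity: the empty set is below {b, c},
# yet its image {a} is not below the image of {b, c}, which is empty.
small, large = frozenset(), frozenset({"b", "c"})
monotone_on_witness = t_p(small) <= t_p(large)
```

The enumeration finds {a} as the only fixpoint, and the witness pair shows that growing the interpretation can shrink the result of the operator.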

For aggregate programs with a $\sqsubseteq$-monotone $T_P$ operator, $T_P$ has a least
fixpoint. As argued in [12], it is a natural solution for the semantics of P.¹

Example 2.6. Reconsider the company control program from Example 2.4. In 
the natural situation where the sort shares is interpreted with positive numbers, 
the Tp is monotone. Indeed, if we add new Controls facts to the structure, then 
the result of summing the shares of controlling companies can only increase and 
hence more Controls facts can be derived. It is well-known that the least fixpoint 
of this operator is the intended solution of this problem. 
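
For a monotone operator on a finite set of interpretations, the least fixpoint can be reached by iterating from the empty interpretation. The Python sketch below does this for the company control rule. The ownership facts are hypothetical, and `t_p` is an ad hoc encoding of the single rule of Example 2.4, not a general evaluator for aggregate programs.

```python
# Hypothetical facts ownsstock(Z, Y, S): company Z owns fraction S of company Y
owns = [("a", "b", 0.6), ("a", "c", 0.3), ("b", "c", 0.3)]
companies = ["a", "b", "c"]

def t_p(controls, owns, companies):
    """One application of T_P for the Controls rule."""
    new = set()
    for x in companies:
        for y in companies:
            # {(Z, S) | ownsstock(X,Y,S) and Z = X  or  Controls(X,Z) and ownsstock(Z,Y,S)}
            pairs = {(z, s) for (z, y2, s) in owns
                     if y2 == y and (z == x or (x, z) in controls)}
            if sum(s for _, s in pairs) > 0.5:      # Sum(..., T) and T > 0.5
                new.add((x, y))
    return new

first_step = t_p(set(), owns, companies)

controls = set()
while True:
    nxt = t_p(controls, owns, companies)
    if nxt == controls:
        break
    controls = nxt
```

In the first step only the direct majority holding is derived. The second step adds control of c, because the shares held through the controlled company b are now included in the sum.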

The examples above illustrate cases where the intended interpretation corre- 
sponds to a minimal fixpoint of the immediate consequence operator. In general, 
nonmonotone operators may have no or multiple (minimal) fixpoints. For pure 
logic programs, not all minimal fixpoints are intended interpretations. This can 
also happen with aggregate programs: 

Example 2.7. Consider the program:

$p(0) \leftarrow Card(\{X \mid p(X)\}, 1)$    $q \leftarrow \neg p(0)$

This program has two fixpoints: one in which p is interpreted by the singleton
{0} and q is false, and another in which p is empty and q is true. Both are
minimal, but clearly only the second is intended. The first fixpoint is not
acceptable because p(0) depends only positively on itself; hence, the model is
not well-founded.

Approximation theory [5,6] is a fixpoint theory that assigns fixpoints to any 
nonmonotone operator in a complete lattice and was used to describe the seman- 
tics of logic programs, default logic and autoepistemic logic. The next section 
recalls its basics and uses it to define a semantics for aggregate programs. 

3 Approximation Theory 

Let $(L, \leq)$ be a complete² lattice with least element $\bot$ and largest element
$\top$. The bilattice of $(L, \leq)$ is the structure $(L^2, \leq, \leq_i)$ where we define
$(x,y) \leq (x_1,y_1)$ iff $x \leq x_1$ and $y \leq y_1$, and $(x,y) \leq_i (x_1,y_1)$ iff
$x \leq x_1$ and $y \geq y_1$. Both $\leq$ and $\leq_i$ are complete lattice orders in $L^2$.
The order $\leq$ has least element $(\bot,\bot)$ and largest element $(\top,\top)$ whereas
$\leq_i$ has least element $(\bot,\top)$ and largest element $(\top,\bot)$.

¹ Note that the approach of monotone aggregation is far more general than the case
of programs with a $\sqsubseteq$-monotone $T_P$ operator. Indeed, in monotone aggregation,
programs are considered for which $T_P$ is monotone with respect to some user defined
order. Such operators are not necessarily $\sqsubseteq$-monotone.

² Each subset has a least upper bound and a greatest lower bound.

In approximation theory, the intuition underlying bilattice elements (x, y) is
that they are approximations of lattice elements. In particular, (x, y)
approximates a lattice element z if $x \leq z \leq y$. The elements x and y constitute
respectively an underestimate and an overestimate of z. The set of lattice
elements approximated by (x, y) is denoted [x, y]. Note that this interpretation
is only possible for tuples (x, y) such that $x \leq y$. We call such tuples
consistent. Inconsistent tuples do not approximate any lattice element ([x, y]
is empty). The order $\leq_i$ is called the information order; $(x,y) \leq_i (x_1,y_1)$
implies that $(x_1,y_1)$ is a more precise approximation than (x, y) (in
particular, $[x,y] \supseteq [x_1,y_1]$). Note that for tuples (x, x), [x, x] = {x}.
Therefore, the tuples of identical elements constitute the natural embedding
of L in $L^2$.

We define a framework for studying the fixpoints of any operator on a com- 
plete lattice by investigating the fixpoints of approximations. 

Definition 3.1. Let $O : L \to L$ be an operator on L. An approximation
$A : L^2 \to L^2$ of O is a $\leq_i$-monotone operator such that:

— $A(x,x) = (O(x), O(x))$, that is, A coincides with O on the embedding of L
in $L^2$.

— A is symmetric, that is, if $A(x,y) = (x_1,y_1)$ then $A(y,x) = (y_1,x_1)$.

Every operator $A : L^2 \to L^2$ can be defined by a unique pair $A^1, A^2$ of
functions of type $L^2 \to L$ such that for each (x, y), $A(x,y) = (A^1(x,y), A^2(x,y))$.
It is straightforward to see that A is an approximation of some operator iff (1)
for each pair (x, y), $A^2(x,y) = A^1(y,x)$ and (2) $A^1$ is monotone in its first
argument and antimonotone in its second argument. Vice versa, each operator
$A^1 : L^2 \to L$ which is monotone in its first and antimonotone in its second
argument can be used to construct an approximation A: A maps a tuple (x, y)
to the tuple $(A^1(x,y), A^1(y,x))$ and approximates the operator $O : L \to L$ where
for each x, $O(x) = A^1(x,x)$.

Let A be an approximation of O. Then A has a least fixpoint KK(A) in the
information order $\leq_i$, called the Kripke-Kleene fixpoint of A. The Kripke-Kleene
fixpoint approximates each fixpoint of O. This is not good enough if we want to
approximate minimal fixpoints of O (or some subset of them).

In order to obtain better approximations, we define a stable operator
$ST_A : L \to L$. Observe that with each lattice element x, we can associate an
$L \to L$ operator, denoted $A^1(\cdot,x)$, which maps elements z to $A^1(z,x)$. The
operator $A^1(\cdot,x)$ is monotone and has a least fixpoint. Define
$ST_A(x) = lfp(A^1(\cdot,x))$. $ST_A$ is an antimonotone operator. Fixpoints of $ST_A$
are called stable fixpoints of A. It can be shown that they are minimal
fixpoints of O. The extended stable operator $\mathcal{ST}_A : L^2 \to L^2$ maps (x, y) to
$(ST_A(y), ST_A(x))$ and is $\leq_i$-monotone. It has a $\leq_i$-least fixpoint WF(A),
called the well-founded fixpoint of A.

An operator O may have many different approximations. It was investigated
in [6] how the fixpoints of these different approximations relate and how
approximations can be constructed from O. Let $L^c$ be the restriction of $L^2$ to
the consistent pairs. We say that an approximation $A_1$ is less precise than an
approximation $A_2$, denoted $A_1 \leq_i A_2$, if for each $(x,y) \in L^c$,
$A_1(x,y) \leq_i A_2(x,y)$. The following properties have been proven:

Proposition 3.1. [6] Let $A_1 \leq_i A_2$. Then: (1) A stable fixpoint of $A_1$ is a
stable fixpoint of $A_2$. (2) The Kripke-Kleene and well-founded fixpoints of $A_1$
are less precise (w.r.t. $\leq_i$) than those of $A_2$.

Two important observations can be derived. First, two approximations
coinciding on $L^c$ have the same stable, Kripke-Kleene and well-founded fixpoints.
The behaviour of the approximation on the inconsistent elements is not relevant!
Hence, it suffices to define an approximation only on the consistent elements.

Definition 3.2 (Partial Approximation). An operator $A : L^c \to L^c$ is a
partial approximation of an operator $O : L \to L$ if A is $\leq_i$-monotone and
$A(x,x) = (O(x),O(x))$ for each lattice element x.

It can be proven that any partial approximation can be extended to an
approximation defined on $L^2$. Hence, the sequel focuses on partial approximations.

Improving an approximation yields increasing sets of stable fixpoints and 
more precise Kripke-Kleene and well-founded fixpoints. There must be a limit 
to this process. One can construct a most precise partial approximation: 

Definition 3.3 (Ultimate Approximation). Let $O([x,y]) = \{O(z) \mid z \in [x,y]\}$.
The ultimate partial approximation $Ult(O) : L^c \to L^c$ of $O : L \to L$ is:

$Ult(O)(x,y) = (glb(O([x,y])), lub(O([x,y])))$

It is easy to show that $Ult(O)$ is $\leq_i$-monotone, coincides with O on the
pairs (x, x), and is more precise than any other partial approximation of O.
Moreover, its fixpoints are determined only by O. Thus, for each operator O on
$(L, \leq)$, we define the ultimate stable (respectively ultimate Kripke-Kleene and
ultimate well-founded) fixpoints of O as the stable (respectively Kripke-Kleene
and well-founded) fixpoints of its ultimate partial approximation $Ult(O)$.
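
To make the construction concrete, the following Python sketch computes an ultimate well-founded fixpoint by brute force on a powerset lattice. It rests on assumptions that go beyond the text: the operator `tp` encodes the two-rule program $p(0) \leftarrow Card(\{X \mid p(X)\}, 1)$, $q \leftarrow \neg p(0)$ with the atom p(0) written as `"p0"`; and, since a partial approximation is only defined on consistent pairs, the lower bound is taken as the least fixpoint of the first component on [bottom, y] and the upper bound as the least fixpoint of the second component on [x, top]. All function names are hypothetical.

```python
from itertools import combinations

ATOMS = frozenset({"p0", "q"})  # "p0" stands for the atom p(0)


def tp(i):
    """T_P for: p(0) <- Card({X | p(X)}, 1)   and   q <- not p(0)."""
    out = set()
    if len(i & {"p0"}) == 1:  # the Card aggregate equals 1
        out.add("p0")
    if "p0" not in i:
        out.add("q")
    return frozenset(out)


def interval(x, y):
    """All z with x <= z <= y in the powerset lattice (assumes x <= y)."""
    free = sorted(y - x)
    return [x | frozenset(c) for k in range(len(free) + 1)
            for c in combinations(free, k)]


def ult_lower(x, y):
    """First component of Ult(O)(x, y): glb of O over [x, y]."""
    return frozenset.intersection(*[tp(z) for z in interval(x, y)])


def ult_upper(x, y):
    """Second component of Ult(O)(x, y): lub of O over [x, y]."""
    return frozenset.union(*[tp(z) for z in interval(x, y)])


def lfp(step, start):
    """Iterate a monotone step function from `start` until it stabilises."""
    current = start
    while step(current) != current:
        current = step(current)
    return current


def well_founded():
    """Iterate the stable revision from the least precise pair (bottom, top)."""
    x, y = frozenset(), ATOMS
    while True:
        new_x = lfp(lambda z: ult_lower(z, y), frozenset())  # on [bottom, y]
        new_y = lfp(lambda z: ult_upper(x, z), x)            # on [x, top]
        if (new_x, new_y) == (x, y):
            return x, y
        x, y = new_x, new_y


print(well_founded())
```

Under these assumptions the iteration stops at the pair ({q}, {q}), in which p(0) is false and q is true. Each call to `ult_lower` or `ult_upper` enumerates the whole interval, so the sketch is exponential in the number of undefined atoms and is meant for small examples only.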



3.1 Approximations of Logic Programs 

Approximation theory was used in [7] for describing the semantics of Logic
Programming, Default logic and Autoepistemic logic. In the case of Logic
Programming, the underlying lattice is the lattice of Herbrand interpretations.
The bilattice corresponds to 4-valued interpretations. In particular, any pair
(I, J) of 2-valued interpretations defines the following truth function:

— p is true in (I, J) iff $p \in I \cap J$; i.e. under- and overestimate agree on the
truth of p.

— p is false in (I, J) iff $p \notin I \cup J$; i.e. under- and overestimate agree on the
falsity of p.

— p is undefined in (I, J) iff $p \in J \setminus I$; i.e. the underestimate of p is false
and the overestimate is true.

— p is inconsistent in (I, J) iff $p \in I \setminus J$; i.e. the underestimate of p is true
and the overestimate is false.

The consistent bilattice elements (I, J) correspond exactly to the 3-valued
interpretations (since $I \setminus J$ is empty).

In terms of the bilattice representation of 4-valued interpretations, the
four-valued immediate consequence operator $\mathcal{T}_P$ of a program P [9] can be
defined as the operator that maps any pair (I, J) to $(T_P(I,J), T_P(J,I))$. Here,
$T_P(I,J)$ is defined as the 2-valued interpretation I' such that a ground atom p
is true in I' iff there exists a ground instance $p \leftarrow B$ of a rule such that
each positive literal of B is true in I and each negative literal is true in J.

The main result about the relationship between approximation theory and
LP-semantics is that the 4-valued immediate consequence operator $\mathcal{T}_P$ of a
program P is an approximation of the 2-valued immediate consequence operator
$T_P$, that the stable operator of P is the stable operator of $\mathcal{T}_P$ and hence that
the Kripke-Kleene, stable and well-founded models of P are the Kripke-Kleene,
stable and well-founded fixpoints of $\mathcal{T}_P$.

In general, $\mathcal{T}_P$ is not the ultimate approximation³ of $T_P$, and consequently,
the ultimate fixpoints of $T_P$ are not the Kripke-Kleene, stable and well-founded
models of P. Some important relationships between the standard models and the
ultimate versions are shown below. Let P be a logic program.

Proposition 3.2. (a) The well-founded (resp. Kripke-Kleene) model of P is
less precise than the ultimate well-founded (resp. Kripke-Kleene) model of P. (b)
If the well-founded model of P is 2-valued then it is the ultimate well-founded
model of P. (c) If $P_1, P_2$ are two different programs such that their 2-valued
immediate consequence operators coincide, then their ultimate Kripke-Kleene,
ultimate well-founded and ultimate stable models coincide.

Example 3.1. A simple example of a logic program for which the standard and
ultimate semantics differ is the following:

$P_1 = \{p \leftarrow p, \; p \leftarrow \neg p\}$

Its Kripke-Kleene and well-founded fixpoint is ({}, {p}) (i.e. p is undefined) and
it has no stable models. On the other hand, note that the 2-valued immediate
consequence operator of this program is monotone and that it coincides with
the immediate consequence operator of the program

$P_2 = \{p \leftarrow\}$

Hence, they have the same ultimate models. Since the Kripke-Kleene model of
$P_2$ is the 2-valued interpretation {p}, this is also the ultimate Kripke-Kleene,
ultimate well-founded and the unique ultimate stable model of $P_1$.

³ It can be seen easily that the 3-valued ultimate approximation is obtained by
evaluating rule bodies, not according to the standard 3-valued truth function but
using the technique of supervaluations. In this scheme, a formula is true (false)
in a 3-valued interpretation I if it is true (false) in each 2-valued
interpretation approximated by I. Otherwise, the formula is undefined.
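
The difference between the two Kripke-Kleene fixpoints in Example 3.1 can be reproduced with a small Python sketch. The rule representation (head, positive body, negative body) and the names `t`, `standard`, `ultimate` and `kripke_kleene` are assumptions made for this illustration only.

```python
from itertools import combinations

# A rule is (head, positive_body, negative_body); P1 = {p <- p, p <- not p}.
P1 = [("p", ("p",), ()), ("p", (), ("p",))]
ATOMS = frozenset({"p"})


def t(program, i, j):
    """T_P(I, J): positive literals are evaluated in I, negative ones in J."""
    return frozenset(
        head
        for head, pos, neg in program
        if all(a in i for a in pos) and all(a not in j for a in neg)
    )


def standard(program):
    """The four-valued operator mapping (I, J) to (T_P(I, J), T_P(J, I))."""
    return lambda i, j: (t(program, i, j), t(program, j, i))


def ultimate(program):
    """Ult(T_P) on a consistent pair, by enumerating the interval [I, J]."""
    def approx(i, j):
        free = sorted(j - i)
        images = [
            t(program, i | frozenset(extra), i | frozenset(extra))
            for k in range(len(free) + 1)
            for extra in combinations(free, k)
        ]
        return frozenset.intersection(*images), frozenset.union(*images)
    return approx


def kripke_kleene(approx, atoms):
    """Least fixpoint in the information order, iterated from (bottom, top)."""
    pair = (frozenset(), frozenset(atoms))
    while approx(*pair) != pair:
        pair = approx(*pair)
    return pair


kk_standard = kripke_kleene(standard(P1), ATOMS)
kk_ultimate = kripke_kleene(ultimate(P1), ATOMS)
print(kk_standard)  # p is undefined: lower bound {}, upper bound {p}
print(kk_ultimate)  # p is true: both bounds are {p}
```

The standard operator leaves p undefined because neither rule body is true in the 3-valued sense, whereas the enumeration over [I, J] sees that p is derived in every 2-valued interpretation of the interval.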

3.2 Ultimate Semantics for Programs with Aggregates 

Given are a signature $\Sigma' = \Sigma \cup \Sigma_d$, a $\Sigma$-structure $I_o$ for the interpreted
symbols $\Sigma$, and an aggregate program P with defined predicates $\Sigma_d$. The
structure $(\mathcal{I}_{I_o}, \sqsubseteq)$ of $\Sigma'$-structures extending $I_o$ is a complete lattice.
The operator $T_P$ is an operator in this lattice. Hence, we can apply ultimate
approximation theory.

Definition 3.4. The ultimate Kripke-Kleene, ultimate well-founded and ultimate
stable models of P are the ultimate Kripke-Kleene, ultimate well-founded
and ultimate stable fixpoints of $T_P$.

4 Analysis and Results

In this section, we investigate the ultimate semantics of aggregate programs.
Everywhere in this section, we assume the existence of a given signature $\Sigma$, the
$\Sigma$-structure $I_o$ and an aggregate program P based on $\Sigma' = \Sigma \cup \Sigma_d$.

The first result shows that if the immediate consequence operator is mono- 
tone, then the ultimate well-founded model is the least fixpoint of this operator. 

Theorem 4.1. [6] If $T_P$ is $\sqsubseteq$-monotone then its ultimate well-founded and
unique stable fixpoint are equal to the least fixpoint of $T_P$.

Note that this property is not satisfied by the well-founded semantics. Example
3.1 illustrates this. The 2-valued $T_P$ operator of the program $P_1$ is constant and
hence monotone, yet its least fixpoint is not the well-founded model of $P_1$.

Aggregate stratified programs are studied in [1,14]. P is an aggregate
stratified program iff the body B of any rule is a conjunction of atoms, negative
literals and aggregate atoms. Moreover, for each predicate symbol p there is a
level number $i_p$ such that for each rule $p \leftarrow B$: if predicate symbol q occurs
in an atom of B then $i_q \leq i_p$, and if q appears in a negative literal in B or in
a set expression in an aggregate atom in B, then $i_q < i_p$. In other words,
predicate symbols appearing negatively or in an aggregate atom in the body of a
rule defining p must be defined in lower levels.

The next theorem expresses that the ultimate well-founded (and stable) se- 
mantics of aggregate programs generalises the perfect model semantics of aggre- 
gate stratified programs. 

Theorem 4.2. If P is an aggregate stratified program then its ultimate
well-founded model is its perfect model [1,14].

Another theorem shows that each aggregate program has a natural transla- 
tion into a ground, infinitary logic program, such that the ultimate well-founded 
model of the aggregate program is the well-founded model of its translation. 
Let $B_o$ be the set $\{p(x_1, .., x_n) \mid p \in \Sigma_d$ and $(x_1, .., x_n) \in D_{s_1} \times .. \times D_{s_n}$
(with $(s_1, .., s_n)$ the arity of p)$\}$. For any 2-valued interpretation I extending
$I_o$ and any set B of positive or negative literals of $B_o$, define that $I \models B$ iff
$(x_1, .., x_n) \in p^I$ if $p(x_1, .., x_n) \in B$ and $(x_1, .., x_n) \notin p^I$ if
$\neg p(x_1, .., x_n) \in B$.

Definition 4.1. Define $lp(P)$ as the (infinitary) ground logic program based on
$B_o$ and consisting of all ground clauses $p(x_1, .., x_n) \leftarrow B$ for which there exists
a variable assignment $\sigma$ and a rule $p(t_1, .., t_n) \leftarrow F$ in P such that:

— $p(x_1, .., x_n) = p(eval_{I_o,\sigma}(t_1), .., eval_{I_o,\sigma}(t_n))$ and

— B is any set of literals of $B_o$ such that for each 2-valued $\Sigma'$-interpretation I
extending $I_o$: if $I \models B$ then $I,\sigma \models F$.

Theorem 4.3. The ultimate well-founded model of P is the well-founded model
of $lp(P)$.

Finally, we return to the examples of Section 2.1. Since the immediate
consequence operator of the company control program is monotone (if shares are
positive), the following proposition follows.

Proposition 4.1. For any finite structure $I_o$ interpreting ownsstock and
interpreting the sort shares by positive real numbers, the ultimate well-founded
model is 2-valued and is the unique stable model. Moreover, a fact
$Controls(c_1, c_2)$ belongs to this model iff company $c_1$ controls company $c_2$.

The ultimate well-founded model of the shortest path problem correctly rep- 
resents shortest path distances. 

Proposition 4.2. Let $I_o$ be a structure for the shortest path program $P_{sp}$
defining a weighted directed graph with positive weights. Then the ultimate
well-founded model of $P_{sp}$ is 2-valued and contains a fact sp(x, y, w) iff there
exists a path between x and y with total weight w and the total weight of all
other paths between x and y is greater than or equal to w.

5 Related Work 

The following toy-problems illustrate some issues in the relationship between 
ultimate well-founded semantics and monotone aggregation [17]. 

In the example below, p/1 is a predicate interpreted over the domain {0, 1, 2}.

$p(X) \leftarrow Min(\{Y \mid Y = 1 \vee p(Y)\}, X)$

Intuitively, the intended meaning is $p^I = \{p(1)\}$. However, note that
$p^{I'} = \{p(0)\}$ is a fixpoint too. Note that the immediate consequence operator of
this program is not monotone with respect to the standard order $\sqsubseteq$ on
interpretations. However, it is monotone with respect to other orders. In
approaches of monotone aggregation, one must find such an appropriate order. In
monotone aggregation, only structures that assign a singleton set to p are
considered. In this collection of structures, one can define the following
order: $I \leq J$ if it holds that if $p^I = \{x\}$ and $p^J = \{y\}$, then $x \leq y$. $T_P$ is
monotone with respect to this order. However, its minimal fixpoint assigns
$p^I = \{0\}$. $T_P$ is also monotone with respect to the inverse order $\geq$, and its
least fixpoint in this order is the intended solution $p^I = \{1\}$. This shows that
the selection of the right order is important in monotone aggregation. Consider
now the following extension:

$p(X) \leftarrow Min(\{Y \mid Y = 1 \vee p(Y)\}, X)$

$r(X) \leftarrow Min(\{Y \mid Y = 1 \vee r(Y)\}, Z) \wedge X = 2 - Z$

The immediate consequence operator is no longer monotone with respect to $\geq$.
The above program has the same $T_P$ operator as the following program (obtained
by removing redundant rules from lp(P)):

$p(0) \leftarrow p(0)$    $r(2) \leftarrow r(0)$

$p(1) \leftarrow \neg p(0)$    $r(1) \leftarrow \neg r(0)$

This program is stratified and has a 2-valued well-founded model {p(1), r(1)}.
It is the ultimate well-founded model of the original program.
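
The two fixpoints of the single-rule Min program over the domain {0, 1, 2} can be confirmed by enumeration. This Python sketch is an illustration only; the names `tp` and `fixpoints` are introduced here and are not taken from the paper.

```python
from itertools import combinations

DOMAIN = (0, 1, 2)


def tp(p_ext):
    """T_P for the rule p(X) <- Min({Y | Y = 1 or p(Y)}, X)."""
    candidates = {1} | set(p_ext)  # the set expression {Y | Y = 1 or p(Y)}
    return frozenset({min(candidates)})


# Enumerate every subset of DOMAIN as a candidate extension of p.
fixpoints = [
    frozenset(c)
    for k in range(len(DOMAIN) + 1)
    for c in combinations(DOMAIN, k)
    if tp(frozenset(c)) == frozenset(c)
]
print(fixpoints)  # the extensions {0} and {1}

# Non-monotonicity with respect to set inclusion:
# {} is a subset of {0}, but tp({}) = {1} is not a subset of tp({0}) = {0}.
print(tp(frozenset()) <= tp(frozenset({0})))  # False
```

The enumeration finds exactly the two fixpoints discussed in the text, and the last line gives a witness that the operator is not monotone with respect to inclusion.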

The above examples suggest that the approach developed in this paper might 
be both more general and simpler than monotone aggregation. A formal investi- 
gation of the relationship between ultimate semantics and monotone aggregation 
needs to be conducted. 

Kemp and Stuckey [12] define a well-founded semantics of logic programs with 
aggregates. However, their definition doesn’t deal well with positive recursion 
through aggregation. Hence, their semantics is strictly weaker than the ultimate 
well-founded semantics defined in this paper. For example, in the well-founded 
model of [12] of the following program 

$p(0) \leftarrow Card(\{X \mid p(X)\}, 1)$

the atom p(0) is undefined while it is false in the ultimate well-founded model.

Another extension of the well-founded and valid semantics for programs with
aggregates was proposed in [18]. The extension is more in the spirit of the
ultimate well-founded semantics. The authors rely on having aggregate functions 
defined for 3-valued multisets and satisfying certain properties of monotonicity. 
However, we have shown in this paper (the definition of ultimate approximations) 
that for defining 3-valued (or even 4-valued) semantics of programs with aggre- 
gates one can still work with aggregate functions defined on standard sets. Fur- 
ther investigation is needed about the precise relation of these two approaches. 

Dix and Osorio [8] have also observed that the standard well-founded seman- 
tics is not strong enough for programs with aggregation. They have pointed out 
that the WFS+ (introduced independently by Dix and Schlipf) is well-suited, 
however its complexity is on the first level of the polynomial hierarchy. In their 
paper [8] they propose two weaker extensions of the well-founded semantics 
WFS^ and WFS^ which are polynomially computable. The common property 
for these two semantics is that a is true for a program containing the single rule
$a \leftarrow \neg a$. For this program, a is undefined in the ultimate well-founded semantics.
Whether the ultimate well-founded semantics is weaker in general than WFS^ 
and WFS^ is an open question. 

[12] also defines a stable semantics of programs with aggregates where the 
aggregate atoms are treated like negative literals during the stability transfor- 
mation. As the authors have pointed out, this has the effect that a program 
may have non minimal stable models. In the ultimate well-founded and stable 
semantics for aggregate programs, such problems do not appear. 

Another extension of the stable model semantics for aggregate programs
[16] was developed in the context of the smodels system [15]. The authors consider
weight constraints of the form $L \leq \{l_1 = w_1, \ldots, l_n = w_n\} \leq U$ where the $l_i$
are literals and the $w_i$ are real numbers representing weights. Such a constraint
is true in an interpretation if the sum of the weights of the literals satisfied
by the interpretation is between L and U. For details about how stable models of
programs with weight constraints are computed we refer to [16].
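
The truth condition of a weight constraint can be sketched in Python as follows. This only illustrates the condition stated above, not the smodels algorithm for computing stable models; the function name and the representation of literals as (atom, polarity) pairs are assumptions made here.

```python
def satisfies_weight_constraint(lower, weighted_literals, upper, interpretation):
    """Check L <= {l_1 = w_1, ..., l_n = w_n} <= U in a 2-valued interpretation.

    `weighted_literals` is a list of ((atom, positive), weight) pairs and
    `interpretation` is the set of true atoms (a representation assumed here).
    """
    total = sum(
        weight
        for (atom, positive), weight in weighted_literals
        # a positive literal holds if its atom is true, a negative one if not
        if (atom in interpretation) == positive
    )
    return lower <= total <= upper


# Hypothetical constraint: 2 <= {a = 1, b = 2, not c = 1} <= 3
constraint = [(("a", True), 1), (("b", True), 2), (("c", False), 1)]
print(satisfies_weight_constraint(2, constraint, 3, {"a"}))       # True, sum is 2
print(satisfies_weight_constraint(2, constraint, 3, {"a", "b"}))  # False, sum is 4
```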

To show the relationship with our semantics we can translate a weight
constraint to an aggregate formula of the form

$\exists S.\, Sum(\{I \mid l_1 \wedge I = 1 \vee \ldots \vee l_n \wedge I = n\}, \lambda(I) w(I), S) \wedge L \leq S \wedge S \leq U$

where w(I) is a new interpreted function which corresponds to the weights $w_i$.
For programs with weight constraints only in the body we can show that stable 
models are ultimate stable models of the aggregate program obtained by apply- 
ing the above transformation. In most cases the inverse will also hold but not 
always, as Example 3.1 shows. The advantage of using a special form of aggre- 
gate expressions, like weight constraints, is that more efficient algorithms can 
be developed. In the case of the smodels system, the complexity of computing 
stable models remains in the same class as for logic programs without aggregates 
[16]. The reason for this is the use of a less precise approximation underlying 
smodels. 

In [3], ID-logic is defined as a logic extending first order logic with (general 
nonmonotone inductive) definitions. Each definition can be thought of as one 
logic program defining a subset of predicates. Using the ideas presented here, 
one can extend ID-logic with aggregates. A theory in this extended logic consists 
of aggregate formulas and aggregate programs as defined here. A model of such 
a theory is a 2-valued interpretation I that satisfies all aggregate formulas and 
is an ultimate well-founded model of each of its aggregate programs.⁴

6 Conclusion and Future Work 

In this paper we have used Approximation Theory to develop a well-founded
and stable model semantics for programs with aggregates. We do this by defining
a 2-valued $T_P$ operator for programs with aggregates in an intuitive way.
Approximation theory then shows how to define a most precise approximation
of this operator, called an ultimate approximation. We argued that the
well-founded and stable fixpoints of this ultimate approximation are well suited
for programs with aggregation. In particular, if the $T_P$ operator of a program is
monotone then the ultimate well-founded model is 2-valued and coincides with
the least fixpoint of the $T_P$ operator.

⁴ Note that an aggregate program may have many ultimate well-founded models if
the interpretation $I_o$ is not fixed but may vary.

Another important topic for further research is to investigate the complexity 
of computing the ultimate well-founded semantics for programs with aggregates. 
We expect that if we restrict to Datalog programs, we will obtain exponential
complexity in terms of the number of ground atoms. This is because a one-step
computation of the ultimate approximation on a 3-valued interpretation (I, J)
requires applying the $T_P$ operator to all 2-valued interpretations I' such
that $I \subseteq I' \subseteq J$. If the size of the Herbrand base is n then for the first step,
there are $2^n$ possible interpretations. However, Approximation Theory is an interesting
setting for reducing the complexity. Indeed, note that the ultimate well-founded 
model is approximated by the well-founded model of any approximating opera- 
tor. Less precise alternative approximation operators with lower complexity can 
be searched for such that their well-founded fixpoints coincide with the ultimate 
well-founded model for certain collections of well-behaved programs. An example 
where this is possible is the class of aggregate stratified programs. 
Abstract. We formally characterize alternating fixed points of boolean
equation systems as models of (propositional) normal logic programs. To 
this end, we introduce the notion of a preferred stable model of a logic 
program, and define a mapping that associates a normal logic program 
with a boolean equation system such that the solution to the equation 
system can be “read off” the preferred stable model of the logic program. 
We also show that the preferred model cannot be calculated a-posteriori 
(i.e. compute stable models and choose the preferred one) but rather 
must be computed in an intertwined fashion with the stable model itself. 
The mapping reveals a natural relationship between the evaluation of 
alternating fixed points in boolean equation systems and the Gelfond- 
Lifschitz transformation used in stable-model computation. 

For alternation-free boolean equation systems, we show that the logic 
programs we derive are stratified, while for formulas with alternation, the 
corresponding programs are non-stratified. Consequently, our mapping of 
boolean equation systems to logic programs preserves the computational 
complexity of evaluating the solutions of special classes of equation sys- 
tems (e.g., linear-time for the alternation-free systems, exponential for 
systems with alternating fixed points). 



1 Introduction 

Model checking [2,15,3] is a verification technique aimed at determining whether 
a system specification possesses a property expressed as a temporal logic formula. 
Model checking has enjoyed wide success in verifying, or finding design errors 
in, real-life systems. An interesting account of a number of these success stories 
can be found in [4] . 

Model checking has spurred interest in evaluating alternating fixed points as 
these are needed to express system properties of practical import, such as those 
involving subtle fairness constraints. Probably the most canonical temporal logic
for expressing alternating fixed points is the modal mu-calculus [14,9], which
makes explicit use of the dual fixed-point operators $\mu$ (least fixed point) and
$\nu$ (greatest fixed point). A variety of temporal logics can be encoded in the
mu-calculus, including Linear Temporal Logic (LTL), Computation Tree Logic 
(CTL) and its derivative CTL*. 

Fixed-point operators may be nested in mu-calculus formulas and different 
fixed-point formulas may be mutually dependent on each other. Alternating 
fixed-point formulas are those having a least fixed point that is mutually depen- 
dent on a greatest fixed point. 

Recently, it has been demonstrated that logic programming (LP) can be 
successfully applied to the construction of practical and versatile model check- 
ers [16,7]. Central to this approach is the connection between models of temporal 
logics and models of logic programs. For example, the XMC model checker [17] 
verifies an alternation-free modal mu-calculus formula by evaluating the perfect 
model of an equivalent stratified logic program. While the relationship between 
models of alternating modal mu-calculus formulae and stable models of logic 
programs has been conjectured [11], there has been no formal characterization 
of this connection. Establishing this relationship is the focus of this paper. 

The model-checking problem for the modal mu-calculus can be formulated in 
terms of solving Boolean Equation Systems (BESs): see [19,12] and Appendix A 
of this paper. A BES is a system of mutually dependent equations over boolean- 
valued variables, where each equation is designated as a greatest or least fixed 
point. To characterize the solutions of BESs in terms of models of logic programs, 
we introduce the notion of preferred stable models of normal logic programs, and 
describe a mapping from BESs to propositional normal logic programs such 
that the solution to a BES can be obtained from the preferred stable model of 
the corresponding logic program. The mapping also ensures that alternation- 
free BESs are mapped to stratified logic programs, and is thus a conservative 
extension of the mapping used by the XMC system. This preserves the linear- 
time complexity of model checking alternation-free formulas. 

Preferred answer sets [18,1] have been defined as a way to capture priority 
and preference in knowledge representation. They are defined for a more gen- 
eral class of logic programs including disjunctive logic programs, possibly with 
different flavors of negation (default, classical, etc.). When restricted to normal
logic programs, preferred answer sets differ from our definition of preferred stable 
models. For example, in contrast to the notion of preference used in [18], the so- 
lution to a BES cannot be found by simply imposing a selection criterion on the 
stable models of an equivalent logic program (see Example 2.8). Nevertheless, 
the exact relationship between the solution of a BES and the different preferred 
answer set semantics proposed in the literature remains to be fully explored. 

The rest of this paper develops along the following lines. Section 2 presents 
an informal introduction to BESs and their relationship to logic programs; these 
ideas are formalized in Section 3. Section 4 introduces the notion of a preferred 
stable model of a normal logic program, while Section 6 formally establishes 
the relationship between solutions to BESs and preferred stable models. Our
concluding remarks are offered in Section 7. Due to space limitations, proofs are 
omitted or sketched. Full proofs can be found in [10]. The paper also contains 
an appendix (Appendix A) reviewing the standard connection between model 
checking in the modal mu-calculus and solving BESs. 

2 Overview 

2.1 Boolean Equation Systems 

A Boolean Equation System (BES) is a sequence of fixed-point equations over
boolean variables, with an associated sequence of signs (sign map) that specifies
the polarity of the fixed points. The i-th equation is of the form $X_i = \alpha_i$ where
$\alpha_i$ is a positive boolean formula over variables $\{X_1, X_2, \ldots\}$ and constants 0
and 1. The i-th sign, $\sigma_i$, is $\mu$ if the i-th equation is a least fixed-point
equation and $\nu$ if it is a greatest fixed-point equation. We use
$\langle X_1 = \alpha_1, X_2 = \alpha_2, \ldots, X_n = \alpha_n \rangle$ to denote the sequence of equations in a
BES of size n, and $\langle \sigma_1, \sigma_2, \ldots, \sigma_n \rangle$ to denote its associated sign map. In a
BES of size n, $X_1$ is called the innermost variable (and the equation $X_1 = \alpha_1$
the innermost fixed point), and $X_n$ is called the outermost variable. A BES of
size n is said to be closed if all variables occurring in $\alpha_i$ for all
$1 \leq i \leq n$ are drawn from $\{X_1, X_2, \ldots, X_n\}$.

In the following, we use $\phi$ to range over BESs, and $\mathcal{E}$ (possibly subscripted)
for specific BESs. Let $\phi$ be a BES $\langle X_1 = \alpha_1, X_2 = \alpha_2, \ldots, X_n = \alpha_n \rangle$. We use
$\phi_i$ to denote the subsystem $\langle X_1 = \alpha_1, X_2 = \alpha_2, \ldots, X_i = \alpha_i \rangle$. Thus
$\phi = \phi_n$, and $\phi_0$ denotes the empty BES $\langle\,\rangle$.

In the following, we use a series of examples to informally describe the se- 
mantics of a BES and the relationship between BESs and logic programs; these 
ideas are formalized in subsequent sections of the paper. 

A solution of a BES is a truth assignment to the variables $\{X_1, X_2, \ldots\}$
satisfying the fixed-point equations such that outer equations have higher
priority over inner equations. More precisely, a solution of a BES
$\langle X_1 = \alpha_1, X_2 = \alpha_2, \ldots, X_n = \alpha_n \rangle$ is a valuation that is a fixed point for the
outermost variable $X_n$ and is "minimal" (closest to 0 if $\sigma_n = \mu$, and closest to
1 if $\sigma_n = \nu$) among all solutions for the subsystem
$\langle X_1 = \alpha_1, X_2 = \alpha_2, \ldots, X_{n-1} = \alpha_{n-1} \rangle$.

Example 2.1. Consider the BES $\mathcal{E}_1 = \langle X_1 = X_1 \wedge X_2, X_2 = X_1 \vee X_2 \rangle$ with sign
map $\langle \mu, \mu \rangle$. We first consider all solutions to the subsystem
$\langle X_1 = X_1 \wedge X_2 \rangle$. The fixed points for the subsystem are $(X_1 = 0, X_2 = 0)$,
$(X_1 = 0, X_2 = 1)$ and $(X_1 = 1, X_2 = 1)$. $X_2$ is free in the subsystem so we need
to consider solutions to the subsystem independently of the value of $X_2$. For
$X_2 = 0$, there is only one fixed point and hence $(X_1 = 0, X_2 = 0)$ is a solution.
Among the two fixed points corresponding to $X_2 = 1$, the least (since
$\sigma_1 = \mu$) is $(X_1 = 0, X_2 = 1)$. Hence the solutions for the subsystem are
$(X_1 = 0, X_2 = 0)$ and $(X_1 = 0, X_2 = 1)$. Both of these valuations are fixed
points for $X_2 = X_1 \vee X_2$, but $(X_1 = 0, X_2 = 0)$ is the least fixed point and
(since $\sigma_2 = \mu$) the solution to $\mathcal{E}_1$. □
Example 2.2. The signs of both equations in $\mathcal{E}_1$ were identical; for a more
complex example, consider the BES
$\mathcal{E}_2 = \langle X_1 = X_1 \wedge X_2, X_2 = X_1 \vee X_2, X_3 = X_3 \wedge X_2 \rangle$ with sign map
$\langle \mu, \mu, \nu \rangle$. Following the evaluation of $\mathcal{E}_1$, it is easy to see that the
solutions of the subsystem $\langle X_1 = X_1 \wedge X_2, X_2 = X_1 \vee X_2 \rangle$ are
$(X_1 = 0, X_2 = 0, X_3 = 0)$ and $(X_1 = 0, X_2 = 0, X_3 = 1)$. Of these, only
$(X_1 = 0, X_2 = 0, X_3 = 0)$ is a fixed point for $X_3 = X_3 \wedge X_2$ and hence is the
solution for $\mathcal{E}_2$. □

In $\mathcal{E}_2$, the inner subsystem’s solutions were independent of the values
assigned to the outer variable $X_3$. This property does not hold in general, as
shown by the following example.

Example 2.3. Consider the BES
$\mathcal{E}_3 = \langle X_1 = X_1 \wedge X_2,\ X_2 = X_1 \vee X_2 \rangle$ with
sign map $\langle \nu, \mu \rangle$. We first consider all solutions for the
subsystem $\langle X_1 = X_1 \wedge X_2 \rangle$. As in $\mathcal{E}_1$, the fixed
points of the subsystem are $(X_1 = 0, X_2 = 0)$, $(X_1 = 0, X_2 = 1)$ and
$(X_1 = 1, X_2 = 1)$. The only fixed point having $X_2 = 0$ is
$(X_1 = 0, X_2 = 0)$, and it is a solution. Among the two fixed points
corresponding to $X_2 = 1$, the greatest (since $\sigma_1 = \nu$) is
$(X_1 = 1, X_2 = 1)$. Hence the solutions for the subsystem are
$(X_1 = 0, X_2 = 0)$ and $(X_1 = 1, X_2 = 1)$. Both valuations are fixed points
for $X_2 = X_1 \vee X_2$, but $(X_1 = 0, X_2 = 0)$ is the least fixed point and
(since $\sigma_2 = \mu$) the solution to $\mathcal{E}_3$. □

Nesting and Alternation in BES: We say that $X_i$ depends on $X_j$ if $\alpha_i$
contains a reference to $X_j$, or to $X_k$ such that $X_k$ depends on $X_j$. A BES
is said to be nested if there are two variables $X_i$ and $X_j$ such that $X_i$
depends on $X_j$ and $\sigma_i \ne \sigma_j$. We say that $X_i$ and $X_j$ are
mutually dependent if $X_i$ depends on $X_j$ and vice versa. A BES is alternation
free if whenever $X_i$ and $X_j$ are mutually dependent, $\sigma_i = \sigma_j$.
Otherwise, the BES is said to contain alternating fixed points. For instance, the
BES $\mathcal{E}_1$ has no nested fixed points, $\mathcal{E}_2$ has nested fixed
points but is alternation free, while $\mathcal{E}_3$ has alternating fixed
points. Note that every BES that has alternating fixed points is also nested.
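The three notions above can be checked mechanically. The following is a minimal Python sketch, not part of the paper: the names `depends` and `classify` and the dictionary encoding (each variable index mapped to the set of indices its right-hand side mentions) are assumptions made for illustration only.

```python
def depends(refs):
    """Transitive closure of 'depends on'.

    refs[i] is the set of variable indices that occur in the
    right-hand side of the equation for X_i (hypothetical encoding).
    """
    dep = {i: set(r) for i, r in refs.items()}
    changed = True
    while changed:
        changed = False
        for i in dep:
            # Everything reachable through a variable X_i already depends on.
            extra = set().union(*(dep[k] for k in dep[i])) - dep[i]
            if extra:
                dep[i] |= extra
                changed = True
    return dep


def classify(refs, signs):
    """Return 'not nested', 'nested, alternation free' or 'alternating'."""
    dep = depends(refs)
    nested = any(signs[i] != signs[j] for i in dep for j in dep[i])
    alternating = any(
        signs[i] != signs[j] and i in dep[j] for i in dep for j in dep[i]
    )
    if alternating:
        return "alternating"
    return "nested, alternation free" if nested else "not nested"


# The systems of Examples 2.1, 2.2 and 2.3.
E1 = ({1: {1, 2}, 2: {1, 2}}, {1: "mu", 2: "mu"})
E2 = ({1: {1, 2}, 2: {1, 2}, 3: {3, 2}}, {1: "mu", 2: "mu", 3: "nu"})
E3 = ({1: {1, 2}, 2: {1, 2}}, {1: "nu", 2: "mu"})

for name, (refs, signs) in (("E1", E1), ("E2", E2), ("E3", E3)):
    print(name, classify(refs, signs))
```

Only the variables occurring in each right-hand side and the sign map are needed for the classification; the boolean structure of the formulas plays no role.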

The order of equations in a nested BES is important, as the following example 
shows. 

Example 2.4. Consider the BES
$\mathcal{E}_4 = \langle X_1 = X_2 \vee X_1,\ X_2 = X_2 \wedge X_1 \rangle$ with
sign map $\langle \mu, \nu \rangle$. Note that $\mathcal{E}_4$ differs from
$\mathcal{E}_3$ only in the order in which the equations are defined (and the
corresponding change in variable names). Valuations $(X_1 = 0, X_2 = 0)$ and
$(X_1 = 1, X_2 = 1)$ are solutions for the subsystem $X_1 = X_2 \vee X_1$. Both
are fixed points for $X_2 = X_2 \wedge X_1$, but $(X_1 = 1, X_2 = 1)$ is the
greatest fixed point and (since $\sigma_2 = \nu$) hence is the solution for
$\mathcal{E}_4$. □

2.2 Boolean Equation Systems as Logic Programs 

A normal logic program over a set of propositions $\mathcal{A}$ is a set of
clauses of the form $\gamma \leftarrow \beta$ where $\gamma \in \mathcal{A}$ and
$\beta$ is a boolean formula in negation normal form over
$\mathcal{A} \cup \{0, 1\}$. We use 1 and 0 to denote true and false,
respectively. In a clause of the form $\gamma \leftarrow \beta$, $\gamma$ is
called the head of the clause and $\beta$ its body. We use $p$ and $q$ (possibly
subscripted) to denote propositions and $P$ to denote normal logic programs. A
definite logic program is a program where every clause body is a positive boolean
formula. We say that a proposition $p$ uses another proposition $q$ in a program
if $q$ appears in the body of a clause with $p$ as the head; and $p$ negatively
uses $q$ if $q$ appears in the scope of a negation. A program is said to be
stratified if no cycle in the transitive closure of the uses relation contains
two literals $p$ and $q$ such that $p$ negatively uses $q$.

We use stable models as the semantics of normal logic programs [8]. Sta- 
ble models coincide with the standard least-model semantics for definite logic 
programs and the perfect-model semantics for stratified logic programs. 

A BES consisting only of least fixed points (and hence not nested) can readily 
be seen as equivalent to a definite propositional logic program. We can thus use 
logic-program evaluation techniques to find the solution to such a BES. 

Example 2.5. Consider the propositional program
$P_1 = \{p_1 \leftarrow p_1 \wedge p_2,\ p_2 \leftarrow p_1 \vee p_2\}$. This
program is equivalent to BES $\mathcal{E}_1$ where $p_1$ represents $X_1$ and
$p_2$ represents $X_2$. The least model for $P_1$ is $\{\}$, from which we can
derive $(X_1 = 0, X_2 = 0)$ as the solution for $\mathcal{E}_1$. □

For a nested but non-alternating BES we can construct a stratified propositional
logic program such that the solution of the BES can be obtained from the perfect
model of the logic program. This approach requires that greatest fixed-point
equations be converted to least fixed-point equations using the equivalence
$\nu X.\phi \equiv \neg\mu Z.\neg\phi[\neg Z/X]$ (see e.g. [12]). In fact, this is
the strategy deployed by the XMC model checker.

Example 2.6. Consider the propositional program
$P_2 = \{p_1 \leftarrow p_1 \wedge p_2.\ \ p_2 \leftarrow p_1 \vee p_2.\ \ p_3 \leftarrow \neg q_3.\ \ q_3 \leftarrow q_3 \vee \neg p_2.\}$
This program is equivalent to the BES $\mathcal{E}_2$ letting $p_i$ represent
$X_i$. The perfect model for $P_2$ is $\{q_3\}$, from which we can derive
$(X_1 = 0, X_2 = 0, X_3 = 0)$ as the solution for $\mathcal{E}_2$. □

However, for a BES with alternating fixed points, this translation yields a
non-stratified logic program which may not have a unique stable model.

Example 2.7. Consider the propositional program
$P_3 = \{p_1 \leftarrow \neg q_1.\ \ q_1 \leftarrow q_1 \vee \neg p_2.\ \ p_2 \leftarrow p_1 \vee p_2.\}$
This program is equivalent to the BES $\mathcal{E}_3$ where $p_i$ represents
$X_i$. There are two stable models for $P_3$: $\{p_1, p_2\}$ and $\{q_1\}$, which
yield two candidates $(X_1 = 1, X_2 = 1)$ and $(X_1 = 0, X_2 = 0)$ as solutions
to $\mathcal{E}_3$. □

The problem with the translation from BESs to normal logic programs informally
described by the above examples lies in the fact that a BES defines an ordered
set (sequence) of equations, and the ordering is lost in the corresponding normal
logic program. In fact it is easy to see that the program $P_3$ also corresponds
to the BES $\mathcal{E}_4$ where $p_2$ represents $X_1$ and $p_1$ represents
$X_2$: one of $P_3$’s stable models corresponds to the solution of
$\mathcal{E}_3$ and the other to the solution of $\mathcal{E}_4$.

At first sight, it appears that one can simply “select” the appropriate stable 
model by applying the order information after the stable-model computation. 
The following example illustrates why such an a posteriori selection will not 
work.

Example 2.8. Consider BES
$\mathcal{E}_5 = \langle X_1 = X_2 \wedge X_3,\ X_2 = X_1 \wedge X_3,\ X_3 = X_2 \wedge X_3 \rangle$
with sign map $\langle \nu, \mu, \nu \rangle$. The corresponding logic program is
$\{p_1 \leftarrow \neg q_1,\ q_1 \leftarrow \neg p_2 \vee q_3,\ p_2 \leftarrow \neg q_1 \wedge \neg q_3,\ p_3 \leftarrow \neg q_3,\ q_3 \leftarrow \neg p_2 \vee q_3\}$.
The stable models for this program are $\{p_1, p_2, p_3\}$ and $\{q_1, q_3\}$,
which correspond to solutions $v_1 = (X_1 = 1, X_2 = 1, X_3 = 1)$ and
$v_2 = (X_1 = 0, X_2 = 0, X_3 = 0)$, respectively. Of these, $v_1$ assigns the
value 1 to $X_3$ (the outermost variable) and appears to be minimal since
$\sigma_3 = \nu$. However, the solution to BES $\mathcal{E}_5$ is $v_2$ since
$v_1$ is not a solution to the subsystem
$\langle X_1 = X_2 \wedge X_3,\ X_2 = X_1 \wedge X_3 \rangle$. □

Hence the “minimal” stable model may not correspond to the solution of a 
BES. Rather one must take into account the order information in the BES that 
is lost in the translation to logic programs. In Section 4 we define the notion 
of preferred stable models where information on ordering of literals is taken into 
account in the definition of the model itself. 



3 Solutions to Boolean Equation Systems 

Let $\mathcal{X} = \{X_1, X_2, \ldots\}$ be the set of variables. The set of
positive boolean formulas over $\mathcal{X}$ is given by the following grammar:

\[ \alpha ::= X_i \in \mathcal{X} \mid \alpha_1 \wedge \alpha_2 \mid \alpha_1 \vee \alpha_2 \]

A valuation is a map $v : \mathcal{X} \to \{0, 1\}$ with 0 standing for false and
1 for true. Let $\mathcal{V}$ denote the set of valuations. Given a positive
boolean formula $\alpha$ and a valuation $v$, $\alpha[v]$ denotes the boolean
value obtained by evaluating $\alpha$ using valuation $v$. Given a valuation $v$
and a boolean value $a$, the valuation $v[a/X_i]$ is the valuation that returns
the same value as $v$ for all $X_j$ other than $X_i$ and returns $a$ for $X_i$.

The solution of a boolean equation system $\phi$, denoted by $[\![\phi]\!]$, is
defined as a function that maps valuations to valuations. The mapping is such
that $[\![\phi]\!](v)$ depends on $v$ only for the free variables of $\phi$.
Thus, for a closed system $\phi$, $[\![\phi]\!]$ defines a constant function.

We first consider finding solutions to $\phi_i$, where $\sigma_i = \mu$. Consider
a function $f$ parameterized by a valuation $v$ defined as
$f(v) = \lambda x.\alpha_i[v[x/X_i]]$. Since evaluating a formula w.r.t. a
valuation results in a boolean value, $f(v)$ maps booleans to booleans. Now, the
least fixed point of $f(v)$ (taken w.r.t. the natural partial order on
$\{0, 1\}$ with 0 less than 1), denoted by $\mathrm{lfp}(f(v))$, gives the
smallest value for $X_i$ such that the fixed-point equation
$\mu : X_i = \alpha_i$ holds for the given valuation $v$. Note that the value
assigned by $v$ to $X_i$ is immaterial since $f(v)$ considers only $v[x/X_i]$.
Let valuation $v'$ be such that for all inner variables $X_j$, $j < i$,
$v'(X_j)$ are the fixed points of the equations in $\phi_{i-1}$ when the outer
variables $X_k$, $k > i$, are substituted by $v'(X_k)$. Then
$\mathrm{lfp}(f(v'))$ is the fixed-point value of $X_i$ corresponding to the
values of the outer variables as specified by $v'$. Thus,
$[\![\phi_i]\!](v)(X_i)$ is identical to
$\mathrm{lfp}(f([\![\phi_{i-1}]\!](v)))$. The semantics of greatest fixed-point
equations can be explained similarly.

The solution of a system $\phi$ is defined by induction on the size $i$ of the
system as follows:

\[ [\![\phi_0]\!](v) = v \]

\[ [\![\phi_{i+1}]\!](v) =
   \begin{cases}
     [\![\phi_i]\!](v[\mathrm{lfp}(G_{i+1})/X_{i+1}]) & \text{if } \sigma_{i+1} = \mu \\
     [\![\phi_i]\!](v[\mathrm{gfp}(G_{i+1})/X_{i+1}]) & \text{if } \sigma_{i+1} = \nu
   \end{cases} \]

where

\[ G_{i+1} = \lambda x.\alpha_{i+1}[[\![\phi_i]\!](v[x/X_{i+1}])], \qquad i \ge 0. \]
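The inductive definition translates almost literally into a recursive procedure. The following Python sketch is an illustration under stated assumptions, not the authors' implementation: the function name `solve`, the encoding of right-hand sides as Python functions over a dictionary valuation, and the restriction to closed systems are all choices made here. Because right-hand sides are positive formulas, the auxiliary function (`g` below) is monotone on {0, 1}, so iterating from 0 (for μ) or from 1 (for ν) reaches the required fixed point in at most two steps.

```python
def solve(eqs, signs):
    """Solution of a closed BES, following the inductive definition.

    eqs[i-1]   -- function evaluating the right-hand side of X_i under a
                  valuation given as a dict {index: 0 or 1}
    signs[i-1] -- "mu" or "nu", the sign of the equation for X_i
    """
    n = len(eqs)

    def sem(i, val):
        # Semantics of the subsystem made of the first i equations.
        if i == 0:
            return dict(val)

        def g(x):
            # Right-hand side of X_i once the inner variables have been
            # solved under the assumption X_i = x.
            return eqs[i - 1](sem(i - 1, {**val, i: x}))

        x = 0 if signs[i - 1] == "mu" else 1   # bottom for lfp, top for gfp
        while g(x) != x:
            x = g(x)
        return sem(i - 1, {**val, i: x})

    # The initial values are immaterial for a closed system.
    return sem(n, {k: 0 for k in range(1, n + 1)})


# The systems of Examples 2.1 to 2.4 and 2.8.
E1 = ([lambda e: e[1] & e[2], lambda e: e[1] | e[2]], ["mu", "mu"])
E2 = ([lambda e: e[1] & e[2], lambda e: e[1] | e[2], lambda e: e[3] & e[2]],
      ["mu", "mu", "nu"])
E3 = ([lambda e: e[1] & e[2], lambda e: e[1] | e[2]], ["nu", "mu"])
E4 = ([lambda e: e[2] | e[1], lambda e: e[2] & e[1]], ["mu", "nu"])
E5 = ([lambda e: e[2] & e[3], lambda e: e[1] & e[3], lambda e: e[2] & e[3]],
      ["nu", "mu", "nu"])

for name, bes in (("E1", E1), ("E2", E2), ("E3", E3), ("E4", E4), ("E5", E5)):
    print(name, solve(*bes))
```

The recursion re-solves the inner subsystem for every candidate value of an outer variable, so its cost grows exponentially with the number of equations; it is meant only to make the definition concrete.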



3.1 Solutions as Preferred Fixed Points 

It is useful to treat the solution to a BES as the “minimum valuation” that
satisfies the equations in the BES. We now formalize this notion. We define a
family of partial orders $\sqsubseteq_i$, $i \ge 1$, on valuations that captures
our intuition that least fixed-point variables take values as close to 0 as
possible and greatest fixed-point variables take values as close to 1 as
possible. Further, it also captures the idea that outer variables (i.e. variables
with a higher index) have higher priority than inner variables (i.e. variables
with a lower index).

We say that a valuation $v$ is a fixed point of the system
$\langle X_1 = \alpha_1, \ldots, X_n = \alpha_n \rangle$ if
$v(X_i) = \alpha_i[v]$ for $1 \le i \le n$.

Definition 3.1 (Fixed Points of a BES). The set of fixed points of a BES $\phi$
with respect to a valuation $v$, denoted by $\mathit{FP}(v)(\phi)$, is such that

\[ \mathit{FP}(v)(\phi_0) = \mathcal{V} \]

\[ \mathit{FP}(v)(\phi_{i+1}) = \{u \mid u \in \mathit{FP}(v)(\phi_i) \text{ and } u(X_{i+1}) = \alpha_{i+1}[u]\} \quad \text{if } i \ge 0 \]

Note that in the above definition, we ignore the signs of the equations. We now
define a partial order on valuations based on the signs which is then used to
select the preferred fixed point.

For a given sign map $\sigma$, we define the following partial orders. Let the
partial order $\le_i$ over $\{0, 1\}$ be defined by $0 <_i 1$ iff
$\sigma_i = \mu$ and $1 <_i 0$ iff $\sigma_i = \nu$. The partial order
$\sqsubseteq_i$ over valuations is defined by recursion over $i$ as follows:

\[ u \sqsubseteq_1 v \iff u(X_1) \le_1 v(X_1) \]

\[ u \sqsubseteq_{i+1} v \iff u(X_{i+1}) <_{i+1} v(X_{i+1}) \ \text{ or } \ \bigl(u(X_{i+1}) = v(X_{i+1}) \wedge u \sqsubseteq_i v\bigr) \]

We say that $u \sqsubset_i v$ if $u \sqsubseteq_i v$ and $u \ne v$. It is easy to
see that $\sqsubseteq_i$ is a partial order on any set of valuations that agree
at $X_j$ for all $j > i$.

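Definition 3.2 (Preferred Fixed Points). The preferred fixed points of a BES
$\phi$ with respect to a valuation $v$, denoted by $\mathit{PFP}(v)(\phi)$, is
the set of valuations such that:

\[ \mathit{PFP}(v)(\phi_0) = \mathcal{V} \]

\[ \mathit{PFP}(v)(\phi_{i+1}) = \min_{\sqsubseteq_{i+1}} \bigl(\{u \mid u \in \mathit{PFP}(v[u(X_{i+1})/X_{i+1}])(\phi_i) \cap \mathit{FP}(v)(\phi_{i+1})\}\bigr) \quad \text{if } i \ge 0 \]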

Observe from the above definitions that
$\mathit{PFP}(v)(\phi) \subseteq \mathit{FP}(v)(\phi)$. Moreover,
$u(X_j) = v(X_j)$ for all $j > (i + 1)$ for any $u \in \mathit{FP}(v)(\phi_i)$.
Thus, $\sqsubseteq_{i+1}$ is a linear order (as it is a lexicographic order) on
the set of valuations in $\mathit{FP}(v)(\phi_{i+1})$. This leads us to the
following proposition:

Proposition 3.3. For every boolean equation system $\phi$ and valuation $v$,
there is a unique preferred fixed point, i.e., $|\mathit{PFP}(v)(\phi)| = 1$. In
particular, when the system is closed, there is exactly one preferred fixed
point.

For the preferred fixed point to capture the solution of a given BES, the
preference $\min_{\sqsubseteq}$ must be applied to a set of preferred fixed
points of the inner equation. To see this, consider the following formula:

\[ X_1 = X_2 \wedge X_3 \]
\[ X_2 = X_3 \wedge X_1 \]
\[ X_3 = X_2 \wedge X_3 \]

with $\sigma_1 = \nu$, $\sigma_2 = \mu$ and $\sigma_3 = \nu$. It is easy to
verify that the valuation $v$ that assigns 1 to $X_1$, $X_2$ and $X_3$ is a fixed
point, and is minimal w.r.t. $\sqsubseteq_3$. However, it is also easy to check
that the solution to the BES assigns 0 to $X_1$, $X_2$ and $X_3$, and this is the
preferred fixed point according to the definition given above.
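The first half of this observation can be reproduced by brute force. The sketch below is illustrative only (the helper names `is_fixed_point` and `rank` are assumptions): it enumerates all fixed points of the three equations above and picks the least one under the lexicographic order that compares the outermost variable first, treating 0 as smaller for a μ-variable and 1 as smaller for a ν-variable.

```python
from itertools import product

# X1 = X2 /\ X3, X2 = X3 /\ X1, X3 = X2 /\ X3 with signs (nu, mu, nu).
eqs = [lambda e: e[2] & e[3], lambda e: e[3] & e[1], lambda e: e[2] & e[3]]
signs = ["nu", "mu", "nu"]


def is_fixed_point(val):
    """True if val satisfies every equation, ignoring the signs."""
    return all(val[i + 1] == eq(val) for i, eq in enumerate(eqs))


def rank(val):
    """Sort key for the lexicographic order: X3 first, then X2, then X1.

    A mu-variable prefers 0; a nu-variable prefers 1, so its bit is flipped.
    """
    return tuple(
        val[i] if signs[i - 1] == "mu" else 1 - val[i] for i in (3, 2, 1)
    )


valuations = [dict(zip((1, 2, 3), bits)) for bits in product((0, 1), repeat=3)]
fixed_points = [v for v in valuations if is_fixed_point(v)]
naive_min = min(fixed_points, key=rank)

print(fixed_points)   # the all-0 and the all-1 valuation
print(naive_min)      # the all-1 valuation, which is not the solution
```

Taking the minimum over all fixed points therefore selects the all-1 valuation, whereas the text states that the solution is the all-0 valuation; the preference has to be applied level by level to preferred fixed points of the inner subsystem.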

Theorem 3.4. Let $\phi$ be a BES of size $n$. Then, for all $i \le n$,
$[\![\phi_i]\!](v)$ is the preferred fixed point of $\phi_i$ w.r.t. $v$.

The proof follows by an easy induction on i. 



4 Preferred Stable Models of Normal Logic Programs 

Let $P = \{p_1 \leftarrow \beta_1, p_2 \leftarrow \beta_2, \ldots, p_n \leftarrow \beta_n\}$
be a logic program. A proposition $p \notin \{p_1, p_2, \ldots, p_n\}$ is said to
be free in $P$ if there is some $\beta_i$ such that $p$ occurs in $\beta_i$.

We represent a model of a logic program by a substitution that maps propositions
to truth values $\{0, 1\}$. We use $w_0$ to denote the substitution that maps all
propositions to 0. Given a substitution $w$ over propositions, we extend it to
literals such that $w(\neg p) = \neg w(p)$ for every proposition $p$, where
$\neg 0 = 1$ and $\neg 1 = 0$. Finally, given a program $P$ and a substitution
$w$, the program $P[w]$ is the one obtained by substituting all free propositions
$p$ in $P$ by $w(p)$.

Definition 4.1 (Least Models for Definite Logic Programs). The least model of a
definite logic program $P$ w.r.t. a substitution $w$ on the free propositions of
$P$ is defined by the following equations:

\[ M(\{\})(w) = w \]

\[ M(\{p_i \leftarrow \beta_i\} \cup P')(w) = M(P')(w[b_i/p_i]) \quad \text{where } b_i = \mathrm{lfp}(\lambda x.\,\beta_i[M(P')(w[x/p_i])]) \]

The traditional least model of $P$ under the closed world assumption is simply
$M(P)(w_0)$.
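For a definite program the least model can also be computed by the familiar bottom-up iteration, which yields the same model as the nested formulation. The Python sketch below is an assumption-laden illustration rather than the definition itself: `least_model` is a hypothetical name, and clause bodies are encoded as monotone Python functions over a dictionary of truth values.

```python
def least_model(program, w):
    """Least model of a definite program by bottom-up iteration.

    program -- dict mapping each head proposition to a function that
               evaluates its (positive) body under a dict of 0/1 values
    w       -- dict giving the values of the free propositions
    """
    model = {**w, **{head: 0 for head in program}}
    changed = True
    while changed:
        changed = False
        for head, body in program.items():
            # A head becomes true as soon as its body is true; since
            # bodies are positive, it never has to be retracted.
            if model[head] == 0 and body(model) == 1:
                model[head] = 1
                changed = True
    return model


# P1 of Example 2.5: p1 <- p1 /\ p2,  p2 <- p1 \/ p2.
P1 = {
    "p1": lambda m: m["p1"] & m["p2"],
    "p2": lambda m: m["p1"] | m["p2"],
}
print(least_model(P1, {}))

# A program with a free proposition r: p <- r.
print(least_model({"p": lambda m: m["r"]}, {"r": 1}))
```

For P1 every proposition stays false, which corresponds to the empty least model reported in Example 2.5.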

We now recall the definition of stable model semantics for normal logic pro- 
grams. 

Definition 4.2 (Gelfond-Lifschitz Transformation [8]). The Gelfond-Lifschitz
transform of a propositional normal logic program $P$ with respect to
substitution $w$ is a program $P\,|\,w$ obtained by replacing every negative
literal of the form $\neg p$ in $P$ by $\neg w(p)$.

Note that for all $P$ and $w$, $P\,|\,w$ is a definite logic program. A
substitution $w$ is a stable model of a program $P$ iff it is the least model of
$P\,|\,w$.

Definition 4.3 (Stable Models). The set of stable models of a normal logic
program $P$ w.r.t. a substitution $w$ on the free propositions of $P$, denoted by
$\mathit{SM}(P)(w)$, is defined as:

\[ \mathit{SM}(P)(w) = \{u \mid u = M(P[w]\,|\,u)(w)\} \]
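For small closed programs the stable models can be enumerated directly from this characterization: guess a candidate, evaluate every negative literal in the candidate (the Gelfond-Lifschitz transform), and check that the least model of the resulting definite program is the candidate itself. The sketch below is a brute-force illustration; the names `least_model` and `stable_models` and the two-argument body encoding are assumptions made here.

```python
from itertools import product


def least_model(program, candidate):
    """Least model of the transform of `program` w.r.t. `candidate`.

    Each body is a function body(pos, neg): positive atoms are read
    from `pos` (the model under construction), negated atoms from
    `neg` (the fixed candidate).
    """
    model = {head: 0 for head in program}
    changed = True
    while changed:
        changed = False
        for head, body in program.items():
            if not model[head] and body(model, candidate):
                model[head] = 1
                changed = True
    return model


def stable_models(program):
    """All stable models of a closed program, as sets of true atoms."""
    heads = sorted(program)
    found = []
    for bits in product((0, 1), repeat=len(heads)):
        u = dict(zip(heads, bits))
        if least_model(program, u) == u:
            found.append({p for p in heads if u[p]})
    return found


# P3 of Example 2.7.
P3 = {
    "p1": lambda pos, neg: 1 - neg["q1"],
    "q1": lambda pos, neg: pos["q1"] | (1 - neg["p2"]),
    "p2": lambda pos, neg: pos["p1"] | pos["p2"],
}

# The program of Example 2.8.
P5 = {
    "p1": lambda pos, neg: 1 - neg["q1"],
    "q1": lambda pos, neg: (1 - neg["p2"]) | pos["q3"],
    "p2": lambda pos, neg: (1 - neg["q1"]) & (1 - neg["q3"]),
    "p3": lambda pos, neg: 1 - neg["q3"],
    "q3": lambda pos, neg: (1 - neg["p2"]) | pos["q3"],
}

print(stable_models(P3))
print(stable_models(P5))
```

Both programs have exactly two stable models, in agreement with Examples 2.7 and 2.8. The enumeration is exponential in the number of propositions and is meant only to make the definition concrete.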



4.1 Preferred Stable Models 

We now define stable models w.r.t. a preference sequence: a sequence
$S = \langle l_1, l_2, \ldots, l_m \rangle$ of literals such that no proposition
appears both positively and negatively in $S$. As usual, we represent by $S_i$
the initial subsequence of $S$ of length $i$. Given a substitution $w$ mapping
propositions to truth values, we extend $w$ to literals with the usual
interpretation that $w(\neg p) = \neg w(p)$ for every proposition $p$, where
$\neg 0 = 1$ and $\neg 1 = 0$.

Definition 4.4 (Preference Order $\sqsubseteq_S$). Given two substitutions $w_1$
and $w_2$ and a preference sequence $S = \langle l_1, l_2, \ldots, l_m \rangle$,
we say that $w_2$ is preferred over $w_1$ (written as $w_1 \sqsubseteq_S w_2$) if
$w_1(l_m) < w_2(l_m)$, or $w_1(l_m) = w_2(l_m)$ and
$w_1 \sqsubseteq_{S_{m-1}} w_2$. For an empty preference sequence
$S = \langle\,\rangle$, $w_1 \sqsubseteq_S w_2$ for any pair of substitutions
$w_1$ and $w_2$.

Note that $\sqsubseteq_S$ defines a lexicographic order on substitutions, and
hence is reflexive and transitive. Moreover, for any pair of substitutions
$w_1, w_2$ that agree at all literals not in $S$, we have
$w_1 \sqsubseteq_S w_2 \wedge w_2 \sqsubseteq_S w_1 \Rightarrow w_1 = w_2$. This
means that for every set of substitutions that agree on all literals not in $S$,
there is a unique minimum element w.r.t. $\sqsubseteq_S$. We denote this element
by $\min_{\sqsubseteq_S}$.

Definition 4.5 (Preferred Stable Models). The preferred stable model of a normal
logic program $P$ w.r.t. a substitution $w$ and a preference sequence $S$,
denoted by $\mathit{PSM}_S(P)(w)$, is defined inductively on the size of $P$ as
follows:

\[ \mathit{PSM}_S(\{\})(w) = w \]

\[ \mathit{PSM}_S(\{p_i \leftarrow \beta_i\} \cup P')(w) = \min_{\sqsubseteq_S} \bigl(\{u \mid u \in \mathit{PSM}_S(P')(w[u(p_i)/p_i]) \cap \mathit{SM}(\{p_i \leftarrow \beta_i\} \cup P')(w)\}\bigr) \]

By $\mathit{PSM}_S(P)$ we denote the set of all preferred stable models w.r.t.
arbitrary substitutions.

It is easy to show that the above definition is well-defined in the sense that
the value of $\mathit{PSM}_S$ is independent of the clause
$\{p_i \leftarrow \beta_i\}$ selected for use in the recursive case.

A preference sequence $S$ is said to be complete w.r.t. program $P$ if every
proposition in $P$ appears (positively or negatively) in $S$. Every program that
has at least one stable model w.r.t. a substitution $w$ has exactly one preferred
stable model w.r.t. a complete preference sequence and $w$. Formally,







K. Narayan Kumar, C.R. Ramakrishnan, and S.A. Smolka 



Proposition 4.6 (Uniqueness of Preferred Stable Models). Let $P$ be a normal
logic program, $w$ be a valuation, and $S$ be a preference sequence that is
complete w.r.t. $P$. Then $|\mathit{PSM}_S(P)(w)| \le 1$. Moreover
$|\mathit{PSM}_S(P)(w)| = 0$ iff $\mathit{SM}(P)(w) = \{\}$. In particular, for
closed programs, $|\mathit{PSM}_S(P)| \le 1$ and is 0 iff
$\mathit{SM}(P) = \{\}$.



5 Mapping Boolean Equation Systems
to Propositional Logic Programs



In order to map BESs to logic programs, we first consider the mapping between the
variables in a given BES $\phi$ and propositions in the corresponding logic
program $P$. The logic program $P$ we derive is over propositions
$\{p_1, p_2, \ldots, q_1, q_2, \ldots\}$ such that each variable $X_i$ in $\phi$
corresponds to literal $p_i$ if $\sigma_i = \mu$ and to $\neg q_i$ if
$\sigma_i = \nu$. The idea behind the mapping is to translate the equations into
clauses in a normal logic program, considering greatest fixed points in terms of
their dual least fixed points. The salient aspect of the mapping we define is
that negation is used only where absolutely necessary: negation will be used only
when variables of differing (fixed-point) signs are related. The following
function $\mathcal{M}$ translates variables of a BES to propositions in the
translated logic program:

\[ \mathcal{M}(X_i) =
   \begin{cases}
     p_i & \text{if } \sigma_i = \mu \\
     \neg q_i & \text{if } \sigma_i = \nu
   \end{cases} \]



We lift $\mathcal{M}$ to boolean expressions by replacing the variables in a
given expression using the above definition. In order to translate greatest
fixed-point equations, we need to find the dual of the equation. This is done by
constructing $\overline{\alpha}$, which is $\neg\alpha$ converted to negation
normal form. When we apply $\mathcal{M}$ to boolean expressions with negation, we
also reduce expressions of the form $\neg\neg p$ to $p$.



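Definition 5.1. The translation function $\mathcal{P}$ maps boolean equation
systems to normal logic programs such that for a boolean equation system $\phi$,
$\mathcal{P}(\phi)$ is given by

\[ \mathcal{P}(\phi_0) = \{\} \]

\[ \mathcal{P}(\phi_{i+1}) =
   \begin{cases}
     \{p_{i+1} \leftarrow \mathcal{M}(\alpha_{i+1})\} \cup \mathcal{P}(\phi_i) & \text{if } \sigma_{i+1} = \mu \\
     \{q_{i+1} \leftarrow \mathcal{M}(\overline{\alpha_{i+1}})\} \cup \mathcal{P}(\phi_i) & \text{if } \sigma_{i+1} = \nu
   \end{cases}
   \qquad \text{for } i \ge 0 \]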



Note that for each $X_i$ in $\phi$ there is exactly one proposition in
$\mathcal{P}(\phi)$. Observe that if $X_k$ appears in $\alpha_i$ then the literal
corresponding to $X_k$ appears in negated form in the program clause
corresponding to $X_i$ if and only if $\sigma_i \ne \sigma_k$. Thus, negative
dependencies are introduced in $\mathcal{P}(\phi)$ only if the corresponding
variables in $\phi$ differ in sign. In an alternation-free BES, the dependency
between opposite-signed variables is cycle-free. Hence we have the following
proposition:



Proposition 5.2. If $\phi$ is an alternation-free boolean equation system then
$\mathcal{P}(\phi)$ is a stratified logic program.
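The translation of Definition 5.1 is a simple syntax-directed rewriting. The Python sketch below is one possible rendering under assumptions made for illustration: formulas are nested tuples `("var", i)`, `("and", a, b)`, `("or", a, b)`, clauses are returned as strings, and the names `lit`, `tr` and `translate` are hypothetical. The dual of a formula is obtained by swapping conjunction and disjunction and negating the variables, with double negations removed on the fly.

```python
def lit(i, signs, negate):
    """Literal standing for X_i, or for its negation if `negate` is set.

    A mu-variable X_i is represented by p_i, a nu-variable by 'not q_i';
    negating the latter gives q_i (double negation removed).
    """
    if signs[i] == "mu":
        return f"not p{i}" if negate else f"p{i}"
    return f"q{i}" if negate else f"not q{i}"


def tr(f, signs, dual):
    """Translate formula f; if `dual` is set, translate the NNF of its negation."""
    if f[0] == "var":
        return lit(f[1], signs, dual)
    op = f[0] if not dual else {"and": "or", "or": "and"}[f[0]]
    return f"({tr(f[1], signs, dual)} {op} {tr(f[2], signs, dual)})"


def translate(eqs, signs):
    """Clauses of the translated program, one per equation, as strings."""
    clauses = []
    for i in sorted(eqs):
        if signs[i] == "mu":
            clauses.append(f"p{i} <- {tr(eqs[i], signs, False)}")
        else:
            clauses.append(f"q{i} <- {tr(eqs[i], signs, True)}")
    return clauses


X = lambda i: ("var", i)

# The BES of Example 2.3: X1 = X1 /\ X2 (nu), X2 = X1 \/ X2 (mu).
E3 = {1: ("and", X(1), X(2)), 2: ("or", X(1), X(2))}
for clause in translate(E3, {1: "nu", 2: "mu"}):
    print(clause)

# The same equations with sign map (mu, mu), as in Example 2.1.
for clause in translate(E3, {1: "mu", 2: "mu"}):
    print(clause)
```

In the first output a negated literal appears exactly where an equation refers to a variable of the opposite sign; in the second, where all signs agree, the program is definite.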



Stratified logic programs have unique stable models which can be evaluated 
in polynomial time. Thus, the logic programs generated from alternation-free 
BESs can be efficiently evaluated.

We complete the translation by defining a mapping between sign maps of BESs and
preference sequences for the corresponding logic programs.

Definition 5.3. The translation function $\mathcal{P}$ maps the sign map $\sigma$
of a BES of size $n$ to the preference sequence
$\langle l_1, l_2, \ldots, l_n \rangle$ such that for all $1 \le i \le n$:

\[ l_i =
   \begin{cases}
     p_i & \text{if } \sigma_i = \mu \\
     \neg q_i & \text{if } \sigma_i = \nu
   \end{cases} \]

We have overloaded the symbol $\mathcal{P}$ to denote the translation functions
that map different aspects of the BES to logic programs with preferences. Note
that $\mathcal{P}(\sigma)$ is a complete preference sequence whenever $\sigma$ is
a sign map of a closed BES.

6 Solutions to Boolean Equation Systems 
Are Preferred Stable Models 

We now show that given a closed BES $\phi$ of size $n$, its solution can be
obtained from the preferred stable model of $\mathcal{P}(\phi)$. This is done by
showing that

(i) the preference order among fixed points in a BES corresponds to the order
imposed by preference sequences,

(ii) every stable model of $\mathcal{P}(\phi)$ is a fixed point of $\phi$, and

(iii) the preferred fixed point of $\phi$ is a stable model of
$\mathcal{P}(\phi)$.

We first formalize the relationship between the valuations of a BES and the
models of the corresponding logic program. Given a valuation $v$ over
$\mathcal{X} = \{X_1, X_2, \ldots\}$, let $\mathcal{P}(v)$ be a substitution over
$\mathcal{A} = \{p_1, p_2, \ldots, q_1, q_2, \ldots\}$ such that for all
$i > 0$, $\mathcal{P}(v)(p_i) = v(X_i)$ and $\mathcal{P}(v)(q_i) = \neg v(X_i)$.
Similarly, for any substitution $w$ over $\mathcal{A}$ satisfying
$w(p_i) = \neg w(q_i)$ for every $i$, we write $\mathcal{P}^{-1}(w)$ to denote
the valuation over $\mathcal{X}$ such that
$\mathcal{P}^{-1}(w)(X_k) = w(p_k)$.

The correspondence between valuations and substitutions, and between sign 
maps and preference sequences, is formalized in the following theorem: 

Theorem 6.1. Let $\phi$ be a closed BES of size $n$ with sign map $\sigma$ and
let $S = \mathcal{P}(\sigma)$ be a preference sequence. For any pair of
valuations $v_1, v_2$ over $\{X_1, X_2, \ldots, X_n\}$,
$v_1 \sqsubseteq_n v_2$ iff $\mathcal{P}(v_1) \sqsubseteq_S \mathcal{P}(v_2)$.

The following theorem formally states the second step needed for establishing 
the correspondence between preferred fixed points and preferred stable models. 

Theorem 6.2. Let $\phi$ be a BES of size $n$ and $v$ be a valuation. If $u$ is a
stable model of $\mathcal{P}(\phi_n)[\mathcal{P}(v)]$ then
$\mathcal{P}^{-1}(u)$ is a fixed point of $\phi$ w.r.t. the valuation $v$.

The proof follows from the definition of stable models and the translation
mapping $\mathcal{P}$.

The third step would be trivial if the converse of Theorem 6.2 were true. It 
turns out, however, that not all fixed points of a BES correspond to stable models 
of the translated program. This mismatch arises because the definition of fixed 
points ignores the signs of the equations as well as the order of nesting, while 
the translation from BES to logic programs ignores only the order of nesting.
Thus even for non-nested BESs the set of fixed points may be larger than the set
of stable models of the corresponding program. For example, consider the BES
$\phi$ with equations $\langle X_1 = X_2, X_2 = X_1 \rangle$ and sign map
$\langle \mu, \mu \rangle$. The BES has two fixed points:
$v_1 = (X_1 = 0, X_2 = 0)$ and $v_2 = (X_1 = 1, X_2 = 1)$, with $v_1$ as the
solution. The program $\mathcal{P}(\phi)$ is
$\{p_1 \leftarrow p_2, p_2 \leftarrow p_1\}$, and has only one stable model
$\{\}$. Similarly the system $\mathcal{E}_3$ in Example 2.3 (Section 2) has three
fixed points but of these only two ($(X_1 = 0, X_2 = 0)$ and
$(X_1 = 1, X_2 = 1)$) correspond to stable models ($\{q_1\}$ and
$\{p_1, p_2\}$, respectively) of the translated program.

Thus, we need to show that we have not “lost” the solution to a BES in the 
translation, as formally stated by the following theorem. 

Theorem 6.3. Let $\phi$ be a closed BES of size $n$ and $v$ be a valuation. Then,
for all $n$, $\mathcal{P}([\![\phi_n]\!](v))$ is a stable model of
$\mathcal{P}(\phi_n)[\mathcal{P}(v)]$.

The proof is by induction on the size of the BES $\phi$. The proof of the base
case relies on the fact that, when $\sigma_1 = \mu$, $p_1$ does not appear in
negative literals in $\mathcal{M}(\alpha_1)$, and hence
$\mathcal{P}(\phi_1)[\mathcal{P}(v)] \,|\, \mathcal{P}([\![\phi_1]\!](v)) = \{(p_1 \leftarrow \mathcal{M}(\alpha_1))[\mathcal{P}(v)]\}$;
symmetrically, when $\sigma_1 = \nu$, $q_1$ does not appear in negative literals
in $\mathcal{M}(\overline{\alpha_1})$, and hence
$\mathcal{P}(\phi_1)[\mathcal{P}(v)] \,|\, \mathcal{P}([\![\phi_1]\!](v)) = \{(q_1 \leftarrow \mathcal{M}(\overline{\alpha_1}))[\mathcal{P}(v)]\}$.
The difficulty in the induction step arises from the following: the definition of
the stable model allows for substitution of values only for the negative
literals. Consequently, the syntax of the resulting program is such that one
cannot directly apply the induction hypothesis. To get around this, we
“approximate” this program (from “below” in the $\mu$ case and from “above” in
the $\nu$ case) by one where the hypothesis is applicable. The details are
available in [10].

Thus, the stable models of $\mathcal{P}(\phi)$ correspond (via the translation
function $\mathcal{P}$ over valuations) to a subset of the set of fixed points of
$\phi$ that contains the preferred fixed point of $\phi$. Since preference orders
over valuations and substitutions coincide (from Theorem 6.1) it is easy to
establish from Definition 4.5 that the preferred stable model of
$\mathcal{P}(\phi)$ corresponds to the preferred fixed point of $\phi$.

Corollary 6.4. Let $\phi$ be a closed BES with sign map $\sigma$ and let
$S = \mathcal{P}(\sigma)$. Then $\mathit{PFP}(\phi) = \{v\}$ and
$\mathit{PSM}_S(\mathcal{P}(\phi)) = \{w\}$ such that
$v = \mathcal{P}^{-1}(w)$.

7 Conclusions 

We have shown how to compute alternating fixed points of boolean equation systems
by translating a given equation system $\phi$ into a propositional normal logic
program $\mathcal{P}(\phi)$, and computing the preferred stable model of
$\mathcal{P}(\phi)$.

Our results provide the basis for extending the XMC model checker, which 
currently handles only the alternation-free fragment of the modal mu-calculus, to 
the full mu-calculus. XMC casts model checking as a query-evaluation problem 
over logic programs with stratified negation and uses the XSB logic-programming 
system [20] for the actual evaluation. For formulas with alternating fixed points, 
the resulting logic program may be non-stratified. XSB, while computing the
well-founded model for such a program, produces a residual program that
effectively summarizes the cycles with negation in the original program. We can
then evaluate the preferred stable model of the residual program using the stable 
models generator of [13] or the DLV system [6] as the core engine. The answer 
to the original model-checking question can be directly obtained from the model 
so computed. Experimental results have shown that the residual programs so 
derived are typically small and the preferred stable model can be calculated 
efficiently [11]. 

We have also shown that for alternation-free boolean equation systems, the 
logic programs we derive are stratified. Consequently, our mapping of boolean 
equation systems to logic programs preserves the linear-time complexity of eval- 
uating solutions of such equation systems established in [5]. We moreover con- 
jecture that for BESs with alternating fixed points, time complexity exponential 
in the “alternation depth” of the equation system can be attained, again match- 
ing the best upper bound known to date. This result would depend critically on 
the use of the Gelfond-Lifschitz transformation to steer the computation of the 
preferred stable models of the non-stratified logic programs that our translation 
produces in the case of alternating fixed points.

A Modal Mu-Calculus and Boolean Equation Systems

Formulas in the modal mu-calculus are constructed from existential (denoted by
$\langle\cdot\rangle$) and universal (denoted by $[\cdot]$) modalities; explicit
greatest and least fixed-point operators (denoted by $\nu$ and $\mu$,
respectively); formula variables that index the fixed points; and the traditional
conjunction/disjunction operators and constants true and false from classical
logic. Models of mu-calculus formulas are given in terms of sets of vertices
(called states) of an edge-labeled graph called a labeled transition system
(LTS). For example, the formula
$\nu X.\langle - \rangle \mathit{true} \wedge [-]X$ characterizes deadlock
freedom: a state $s$ that models $X$ is such that it is possible to make a
transition from $s$ (i.e., the meaning of $\langle - \rangle \mathit{true}$) and
every destination state reached models $X$ (the meaning of $[-]X$).

(a) [labeled transition system diagram]

(b) $\nu X.\langle - \rangle \mathit{true} \wedge [-]X$

(c) $\nu\ X_1 = X_2 \wedge X_4$
    $\nu\ X_2 = X_3$
    $\nu\ X_3 = X_2$
    $\nu\ X_4 = \mathit{false}$

(d) $X_1 = \mathit{false}$
    $X_2 = \mathit{true}$
    $X_3 = \mathit{true}$
    $X_4 = \mathit{false}$

Fig. 1. Example Labeled Transition System (a), mu-calculus formula for deadlock
freedom (b), corresponding Boolean Equation System (c), and its solution (d).



Fixed points in a mu-calculus formula may be nested: i.e., a fixed-point formula
$\phi_1$ may occur in the scope of another fixed-point formula $\phi_2$. We then
say that the outer formula $\phi_2$ depends on the inner formula $\phi_1$.
Moreover, the inner fixed-point formula $\phi_1$ may refer to the variable
indexing the outer fixed point $\phi_2$, thereby making $\phi_1$ and $\phi_2$
mutually dependent. Formulas where the mutually dependent fixed points have
different fixed-point operators (i.e., $\mu$ and $\nu$) are called alternating
fixed-point formulas. An example of an alternating fixed-point formula is:

\[ \nu X.\mu X'.[a]X \wedge [-a]X' \]

which expresses the property that transitions labeled $a$ occur infinitely often
along every infinite path of the system.

The problem of determining the model of a modal mu-calculus formula 
w.r.t. a given LTS can be reduced to solving boolean equation systems (BESs). 
For example, consider the LTS of Figure 1(a) and the formula for deadlock 
freedom. To determine the formula’s model, it suffices to solve the BES of
Figure 1(c): each variable $X_i$ indicates whether or not state $s_i$ of the LTS
is in the model. The solution, given in Figure 1(d), reflects the fact that
states $s_2$ and $s_3$ are in the model while $s_1$ and $s_4$ are not.
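Since every equation of the BES in Figure 1(c) carries the sign ν, the system is not nested and its solution is the simultaneous greatest fixed point, which can be reached by iterating from the all-true valuation. The following Python sketch illustrates this; the function name `greatest_solution` and the dictionary encoding are assumptions made here.

```python
def greatest_solution(eqs):
    """Greatest fixed point of a BES whose equations all have sign nu.

    eqs maps each variable name to a function evaluating its
    right-hand side under a dict of booleans.
    """
    val = {x: True for x in eqs}           # start from the top element
    while True:
        new = {x: rhs(val) for x, rhs in eqs.items()}
        if new == val:
            return val
        val = new


# The BES of Figure 1(c).
bes = {
    "X1": lambda v: v["X2"] and v["X4"],
    "X2": lambda v: v["X3"],
    "X3": lambda v: v["X2"],
    "X4": lambda v: False,
}

print(greatest_solution(bes))
```

The iteration stabilizes after two rounds with X2 and X3 true and X1 and X4 false, the solution shown in Figure 1(d).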
Abstract. We extend a theorem by François Fages about the relationship
between the completion semantics and the answer set semantics of
logic programs to a class of programs with nested expressions permitted
in the bodies of rules. Fages’ theorem is important from the perspective
of answer set programming: whenever the two semantics are equivalent,
answer sets can be computed by propositional solvers, such as SATO,
instead of answer set solvers, such as SMODELS. The need to extend Fages’
theorem to programs with nested expressions is related to the use of
choice rules in the input language of SMODELS.



1 Introduction 

This note continues the line of research on the relationship between two theories 
of negation as failure — one based on program completion [4], the other based 
on stable models, or answer sets [9]. A syntactic condition that guarantees the 
equivalence of these two concepts was given by Fages [7]. Babovich, Erdem and 
Lifschitz [2] generalized Fages’ theorem and showed that results of this type 
can be applied to computing answer sets for a logic program: whenever the two 
semantics are equivalent, answer sets can be computed by propositional solvers, 
such as SATO [16] or relsat [3], instead of answer set solvers, such as smodels 
[13] and dlv [6]. 

This possibility is important from the perspective of answer set programming. 
The idea of this programming method is to reduce a given computational prob- 
lem to computing an answer set for a logic program. Examples and references 
can be found in [11]. 

Lloyd and Topor [12] generalized the completion semantics to programs con- 
taining nested expressions (formulas) in the bodies of rules. A similar general- 
ization of the answer set semantics was proposed in [10]. In this note we show 
how Fages’ theorem on the relationship between completion and answer sets can 
be extended to these more general programs. We argue that this extension of 
Fages’ theorem is important in connection with some of the syntactic constructs 
available in the input language of system smodels.

Consider a simple example. Program

\[ p \leftarrow \mathit{not}\ \mathit{not}\ p, \qquad (1) \]
\[ p \leftarrow p, q \]

contains nested occurrences of negation as failure in the body of the first
rule.$^1$ It belongs to the syntactic class for which our theorem guarantees the
equivalence of the answer set semantics to the completion semantics. This program
has two answer sets $\emptyset$, $\{p\}$; they are identical to the models of the
completion of (1):

\[ p \equiv \neg\neg p \vee (p \wedge q), \]
\[ q \equiv \bot. \]

In the syntax of SMODELS, the first of rules (1) can be written as a “choice 
rule” {p}. The relationship between rules like this and nested expressions is 
systematically studied in [8]. 
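The claim about program (1) can be checked by enumerating all interpretations over {p, q}. The Python sketch below is an illustration under assumptions: it uses the usual reduct for nested expressions (every occurrence of "not F" is evaluated in the candidate set), and the names `subsets`, `least_model_of_reduct` and `is_completion_model` are hypothetical.

```python
from itertools import combinations

ATOMS = ("p", "q")

# Program (1): p <- not not p.   p <- p, q.
# A body is a function body(X, Y): plain atoms are read in X, while
# subformulas starting with 'not' are read in the candidate set Y.
RULES = [
    ("p", lambda X, Y: not ("p" not in Y)),        # not not p
    ("p", lambda X, Y: "p" in X and "q" in X),     # p, q
]


def subsets(atoms):
    return [set(c) for r in range(len(atoms) + 1) for c in combinations(atoms, r)]


def least_model_of_reduct(Y):
    """Least set of atoms closed under the reduct of RULES w.r.t. Y."""
    X = set()
    changed = True
    while changed:
        changed = False
        for head, body in RULES:
            if head not in X and body(X, Y):
                X.add(head)
                changed = True
    return X


def is_completion_model(I):
    """Does I satisfy  p <-> (not not p) or (p and q)  and  q <-> false ?"""
    p, q = "p" in I, "q" in I
    return p == ((not (not p)) or (p and q)) and q is False


answer_sets = [Y for Y in subsets(ATOMS) if least_model_of_reduct(Y) == Y]
completion_models = [I for I in subsets(ATOMS) if is_completion_model(I)]

print(answer_sets)
print(completion_models)
```

Both lists contain exactly the empty set and {p}, matching the statement that the answer sets of (1) coincide with the models of its completion.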

In the next section, we discuss a more interesting example of this kind — a 
formalization of the 8-queens problem in the language of SMODELS. Section 3 is 
a review of answer sets and completion for programs with nested expressions. 
After extending Fages’ syntactic condition of “tightness” to programs of this
kind, we introduce our generalization of Fages’ theorem (Section 4). The proof
is expressed in terms of a generalization of the model-theoretic counterpart of 
completion introduced in [1], called supportedness (Sections 5, 6). 

2 The 8-Queens Problem 



In the 8-queens problem, the goal is to find a configuration of 8 queens on an 
8x8 chessboard such that no queen can be taken by any other queen. In other 
words, no two queens may be on the same row, on the same column, or on the 
same diagonal. 

The 8-queens problem can be presented to SMODELS as in Figure 1. The
choice rule that follows the definitions of row and column instructs SMODELS to
select atoms of the form occupied(R,C) for including in an answer set in such
a way that, for every column C, exactly one atom occupied(R,C) is selected.
The program has 92 answer sets, corresponding to all possible arrangements of 
8 queens. Given this input file, SMODELS produces one of the solutions: 

Stable Model: occupied(4,1) occupied(2,2) occupied(7,3)
occupied(5,4) occupied(1,5) occupied(8,6) occupied(6,7)
occupied(3,8)
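As a sanity check on the numbers quoted above, the following Python sketch (an illustration, not the authors' tooling) verifies that the stable model printed by SMODELS is a legal placement and counts all legal placements by enumerating permutations. The names `is_solution`, `count_solutions` and `SMODELS_MODEL` are assumptions introduced here.

```python
from itertools import permutations


def is_solution(queens):
    """Check a set of (row, column) pairs against the 8-queens conditions."""
    # Exactly one queen in every column, as required by the choice rule.
    if sorted(c for _, c in queens) != list(range(1, 9)):
        return False
    ordered = sorted(queens, key=lambda rc: rc[1])
    for i in range(8):
        for j in range(i + 1, 8):
            (r, c), (r1, c1) = ordered[i], ordered[j]
            # Same row, or same diagonal: abs(R - R1) == abs(C - C1).
            if r == r1 or abs(r - r1) == abs(c - c1):
                return False
    return True


def count_solutions():
    """Count legal placements by trying every assignment of rows to columns."""
    return sum(
        1
        for rows in permutations(range(1, 9))
        if is_solution({(r, c) for c, r in enumerate(rows, start=1)})
    )


# The stable model reported in the text, as (row, column) pairs.
SMODELS_MODEL = {(4, 1), (2, 2), (7, 3), (5, 4), (1, 5), (8, 6), (6, 7), (3, 8)}

print(is_solution(SMODELS_MODEL), count_solutions())
```

The count agrees with the 92 answer sets mentioned in the text.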

^1 The double negation in the first rule of (1) is redundant from the point of view of
the completion semantics, but it does affect the program’s answer sets. On the other 
hand, the second rule is redundant from the point of view of the answer set semantics, 
but, generally, dropping a rule like this can change a program’s completion in an 
essential way. 
row(1..8).
column(1..8).

1{occupied(R,C) : row(R)}1 :- column(C).

:- occupied(R,C), occupied(R,C1),
   row(R), column(C), column(C1), C < C1.

:- occupied(R,C), occupied(R1,C1),
   row(R), column(C), row(R1), column(C1),
   C < C1, abs(R - R1) == abs(C - C1).



Fig. 1. The 8-queens problem presented to SMODELS 

Consider the grounded version of the program in Figure 1, with the domain 
predicates row and column dropped from the rules. It consists of the rules 

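$$1\{\mathit{occupied}(1,C), \dots, \mathit{occupied}(8,C)\}1 \qquad (3)$$

for all $C$ in $\{1, \dots, 8\}$,

$$\leftarrow \mathit{occupied}(R,C), \mathit{occupied}(R,C1) \qquad (4)$$

for all $R$, $C$, $C1$ in $\{1, \dots, 8\}$ such that $C < C1$, and

$$\leftarrow \mathit{occupied}(R,C), \mathit{occupied}(R1,C1) \qquad (5)$$

for all $R$, $R1$, $C$, $C1$ in $\{1, \dots, 8\}$ such that $C < C1$ and $|R - R1| = |C - C1|$.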

Rewritten in terms of nested expressions, as described in [8, Section 6], rule (3) 
becomes 

$$
\begin{array}{l}
\mathit{occupied}(R,C) \leftarrow \mathit{not}\ \mathit{not}\ \mathit{occupied}(R,C) \\
\leftarrow \mathit{occupied}(R,C), \mathit{occupied}(R1,C) \quad (R < R1) \\
\leftarrow \mathit{not}\ \mathit{occupied}(1,C), \dots, \mathit{not}\ \mathit{occupied}(8,C).
\end{array}
\qquad (6)
$$

The first of these rules allows each of the atoms occupied(R,C) to be included or
not included in an answer set arbitrarily. The second rule prohibits the selections
that include more than one atom occupied(R,C) with the same value of C. The
third rule prohibits the selections that include no such atoms for some value of
C.

To sum up, the program from Figure 1 can be rewritten as the union of 
programs (6), (4) and (5). 

The results of this paper show that the answer sets for the program consisting 
of these rules can be equivalently described as the models of the program’s 
completion. Consequently, an answer set for this program can be found by the 
Causal Calculator (ccalc)^2, a system that can compute the completion of a
program, even if the program contains nested expressions, and then can call a 
propositional solver to find a model of the completion. 
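To illustrate what "finding a model of the completion" means here, the sketch below works on a 4x4 board so that all assignments can be enumerated. It is an assumption-laden illustration, not CCALC's procedure: a real system hands the completion to a SAT solver instead of enumerating. In the completion of (4)-(6), the formula for each atom occupied(R,C) is a tautology, and the formula for $\bot$ says that no constraint body is true, so a model is simply an assignment that violates no constraint. The names `violates_constraint` and `completion_models` are hypothetical.

```python
from itertools import product


def violates_constraint(occ, n):
    """True if the body of some constraint among (4)-(6) holds in occ."""
    # Second and third rules of (6): two queens in a column, or an empty column.
    if sorted(c for _, c in occ) != list(range(1, n + 1)):
        return True
    ordered = sorted(occ, key=lambda rc: rc[1])
    for i in range(n):
        for j in range(i + 1, n):
            (r, c), (r1, c1) = ordered[i], ordered[j]
            # Rule (4): same row. Rule (5): same diagonal.
            if r == r1 or abs(r - r1) == abs(c - c1):
                return True
    return False


def completion_models(n):
    """Yield every set of occupied cells that satisfies the completion."""
    cells = [(r, c) for r in range(1, n + 1) for c in range(1, n + 1)]
    for bits in product([False, True], repeat=len(cells)):
        occ = {cell for cell, bit in zip(cells, bits) if bit}
        if not violates_constraint(occ, n):
            yield occ


models = list(completion_models(4))
print(len(models))
```

For n = 4 the enumeration covers 2^16 assignments; for the 8x8 board this naive approach is impractical, which is why a propositional solver such as SATO is used.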

^2 http://www.cs.utexas.edu/users/tag/cc/
We present the union of programs (6), (4) and (5) to CCALC as shown in
Figure 2.^3 CCALC produces the following output, using SATO to find a model:

Satisfying Interpretation: occupied(1,5) occupied(2,8)
occupied(3,4) occupied(4,1) occupied(5,3) occupied(6,6)
occupied(7,2) occupied(8,7)



:- sorts
   row; column.

:- variables
   R,R1 :: row;
   C,C1 :: column.

:- constants
   1..8 :: row;
   1..8 :: column;
   occupied(row,column) :: cwAtomicFormula.

occupied(R,C) <- not (not occupied(R,C)).

<- occupied(R,C), occupied(R1,C), R < R1.

<- (/\R: -occupied(R,C)).

<- occupied(R,C), occupied(R,C1), C < C1.

<- occupied(R,C), occupied(R1,C1),
   C < C1, abs(R-R1) =:= abs(C-C1).

Fig. 2. The 8-queens problem presented to CCALC 

In our experiments, SMODELS took 0.06 seconds to find a solution to the
8-queens problem, and SATO took 0.01 seconds. For 20 queens, the run time of
SMODELS was 200 seconds, and the run time of SATO was 0.08 seconds. For 30
queens, SMODELS did not terminate after many hours of search; the run time of
SATO was only 0.29 seconds.^4 This example shows that, given a representation
of a computational problem in the input language of SMODELS, it is sometimes
faster to find a solution by running a propositional solver on the completion
of the corresponding program with nested expressions than by using SMODELS
itself.

^3 In CCALC, sorts of variables correspond to domain predicates of SMODELS. The
symbol cwAtomicFormula in the declaration of occupied means “an atomic formula
satisfying the closed world assumption”.

^4 We used smodels 2.26 with lparse 0.99.59, and ccalc 1.9 with sato 3.2. The
times do not include the preprocessing done by lparse for smodels and by CCALC
for SATO. The constraint logic programming system CLP [15] is computationally even
more efficient in application to the n-queens problem: it takes 0.01 seconds to find
a solution for 20 queens. According to [14], this problem can also be solved quickly
using the abductive logic programming system SLDNFAC [5].

3 Programs 

The words atom and literal are understood here as in propositional logic; we
call the sign $\neg$ in a negative literal $\neg A$ classical negation to distinguish it from
the symbol for negation as failure (not). Elementary formulas are literals and
the 0-place connectives $\bot$ and $\top$. Formulas are built from elementary formulas
using the unary connective not and the binary connectives , (conjunction) and
; (disjunction). A (nondisjunctive) rule is an expression of the form

$$\mathit{Head} \leftarrow \mathit{Body} \qquad (7)$$

where Head is a literal or $\bot$, and Body is a formula.^5 If Head = $\bot$, we will
sometimes drop $\bot$ from the head; a rule with Head = $\bot$ is called a constraint.
A (nondisjunctive) program is a set of rules.

We define when a consistent set $X$ of literals satisfies a formula $F$
(symbolically, $X \models F$) recursively, as follows:

- for elementary $F$, $X \models F$ if $F \in X$ or $F = \top$,

- $X \models \mathit{not}\ F$ if $X \not\models F$,

- $X \models (F, G)$ if $X \models F$ and $X \models G$,

- $X \models (F; G)$ if $X \models F$ or $X \models G$.

A consistent set $X$ of literals is closed under a program $\Pi$ if, for every rule (7) in $\Pi$,
$\mathit{Head} \in X$ whenever $X \models \mathit{Body}$.

Let $\Pi$ be a program without negation as failure. We say that $X$ is an answer
set for $\Pi$ if $X$ is minimal among the consistent sets of literals closed under $\Pi$.
It is easy to see that there can be at most one such set. For instance, the answer
set for the program

$$
\begin{array}{l}
p \leftarrow \top \\
p \leftarrow p, q
\end{array}
\qquad (8)
$$

is $\{p\}$.

The reduct $\Pi^X$ of a program $\Pi$ relative to a set $X$ of literals is obtained
from $\Pi$ by replacing every maximal occurrence of a formula of the form $\mathit{not}\ F$
in $\Pi$ (that is, every occurrence of $\mathit{not}\ F$ that is not in the range of another not)
with $\bot$ if $X \models F$, and with $\top$ otherwise.^6 A consistent set $X$ of literals is an
answer set for $\Pi$ if it is the answer set for the reduct $\Pi^X$. For instance, $\{p\}$ is
an answer set for (1) since it is an answer set for the reduct (8) of (1) relative
to $\{p\}$.
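The definitions of satisfaction, reduct and answer set can be made concrete for small normal programs. The sketch below is a brute-force illustration under several assumptions that are not in the paper: formulas are encoded as nested Python tuples, "T" and "F" stand for $\top$ and $\bot$, heads are atoms, and all names (`sat`, `reduct`, `answer_sets`, `PROGRAM_1`) are hypothetical.

```python
from itertools import combinations

# A formula is "T", "F", an atom name, ("not", f), ("and", f, g) or ("or", f, g).


def sat(X, f):
    """X |= f, following the recursive definition of satisfaction."""
    if f == "T":
        return True
    if f == "F":
        return False
    if isinstance(f, str):
        return f in X
    if f[0] == "not":
        return not sat(X, f[1])
    if f[0] == "and":
        return sat(X, f[1]) and sat(X, f[2])
    return sat(X, f[1]) or sat(X, f[2])


def reduct(X, f):
    """Replace every maximal occurrence of not F by F(alse) or T(rue)."""
    if isinstance(f, str):
        return f
    if f[0] == "not":
        return "F" if sat(X, f[1]) else "T"
    return (f[0], reduct(X, f[1]), reduct(X, f[2]))


def closed(X, program):
    return all(head in X for head, body in program if sat(X, body))


def subsets(atoms):
    for k in range(len(atoms) + 1):
        for combo in combinations(atoms, k):
            yield set(combo)


def answer_sets(program, atoms):
    """X is an answer set iff X is the minimal set closed under the reduct."""
    result = []
    for X in subsets(atoms):
        red = [(head, reduct(X, body)) for head, body in program]
        if closed(X, red) and not any(
            Y < X and closed(Y, red) for Y in subsets(sorted(X))
        ):
            result.append(X)
    return result


# Program (1): p <- not not p;  p <- p, q
PROGRAM_1 = [("p", ("not", ("not", "p"))), ("p", ("and", "p", "q"))]
print(answer_sets(PROGRAM_1, ["p", "q"]))
```

For X = {p} the reduct computed here is exactly program (8), and the enumeration returns the two answer sets of (1).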

^5 In [10], the syntax of rules is more general: the head may be an arbitrary formula,
in particular a disjunction.

^6 This definition is equivalent to the recursive definition of the reduct given in [10].
We say that a formula (or a program) is normal if it does not contain classical
negation. A normal formula $F$ can be identified with the propositional formula
obtained from $F$ by replacing every comma with $\land$, every semicolon with $\lor$, and
not with $\neg$.

Consider a finite normal program $\Pi$. The completion of $\Pi$ is defined as
follows. If $A$ is an atom or the symbol $\bot$, by $\mathit{Comp}(\Pi, A)$ we denote the
propositional formula

$$A \equiv (\mathit{Body}_1 \lor \dots \lor \mathit{Body}_k) \qquad (9)$$

where the disjunction extends over all rules

$$A \leftarrow \mathit{Body}_i \qquad (10)$$

in $\Pi$ with the head $A$.^7 The completion of $\Pi$ is the set of formulas $\mathit{Comp}(\Pi, A)$
for all $A$. For instance, the bodies of rules (1), written as propositional formulas,
are $\neg\neg p$ and $p \land q$; for this program $\Pi$, formulas (2) are $\mathit{Comp}(\Pi, p)$ and
$\mathit{Comp}(\Pi, q)$. In addition to these two formulas, the completion of this program
also includes $\mathit{Comp}(\Pi, \bot)$, which is the tautology $\bot \equiv \bot$.

As in the case of normal programs without nesting, an answer set for any
normal program $\Pi$ satisfies the completion of $\Pi$. In the next section we will see
under what conditions the converse is true.



4 A Generalization of Fages’ Theorem 



Although the theorem below applies to normal programs only, it is convenient 
to define tightness for programs that may contain classical negation. 

A formula is called negative if every occurrence of an atom in this formula is 
in the scope of a negation as failure. 

We consider nondisjunctive programs consisting of rules of the form

$$\mathit{Head} \leftarrow P, N \qquad (11)$$

where $P$ is an arbitrary formula, and $N$ is a negative formula. In applications, $P$
will usually be a formula that does not contain negation as failure. Any
nondisjunctive program can be turned into a program of this kind using the “equivalent
transformations” discussed in [10, Section 4]. For instance, the rule

$$p \leftarrow q; \mathit{not}\ r$$

does not have the form (11), but it can be equivalently replaced by the pair of
rules

$$
\begin{array}{l}
p \leftarrow q, \top \\
p \leftarrow \top, \mathit{not}\ r.
\end{array}
\qquad (12)
$$

In fact, some disjunctive programs can be converted to this special form as well;
for instance,

$$p; \mathit{not}\ q \leftarrow r$$

can be equivalently replaced by

$$p \leftarrow r, \mathit{not}\ \mathit{not}\ q$$

([10], Proposition 6(iii)).

^7 This is essentially the definition from [12] restricted to the propositional case.

An occurrence of a formula $F$ in a formula is singular if the symbol before
this occurrence is $\neg$; otherwise, the occurrence is regular. It is clear that the
occurrence of $F$ can be singular only if $F$ is an atom. For any formula $G$, by
$\mathit{lit}(G)$ we denote the set of all literals having regular occurrences in $G$. For
instance, $\mathit{lit}(p; \mathit{not}\ \neg r) = \{p, \neg r\}$.

Now we are ready to show how Fages’ syntactic condition is extended to 
programs with nested expressions. 

A program $\Pi$ whose rules have the form (11) is tight on a set $X$ of literals if
there exists a function $\lambda$ from $X$ to ordinals satisfying the following condition:

(*) for every rule (11) in $\Pi$, if $\mathit{Head} \in X$ and $X \models (P, N)$ then, for all
$L \in X \cap \mathit{lit}(P)$, $\lambda(L) < \lambda(\mathit{Head})$.

For instance, program (1) written as

$$
\begin{array}{l}
p \leftarrow \top, \mathit{not}\ \mathit{not}\ p, \\
p \leftarrow (p, q), \top
\end{array}
$$

is tight on $\{p\}$: take $\lambda(p) = 0$. Program (12) is tight on $\{p, q, r\}$: take $\lambda(p) = 1$,
$\lambda(q) = \lambda(r) = 0$.
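For a finite normal program, condition (*) can be tested mechanically: a suitable function $\lambda$ exists exactly when the "must be smaller than" relation induced by the applicable rules has no cycle. The Python sketch below rests on that observation; the tuple encoding of formulas, the rule triples (Head, P, N) and the names `lit`, `tightness_witness`, `PROGRAM_1` and `PROGRAM_12` are assumptions made for illustration, and classical negation is not handled.

```python
def sat(X, f):
    """X |= f for formulas "T", "F", atoms, ("not", f), ("and"/"or", f, g)."""
    if f == "T":
        return True
    if f == "F":
        return False
    if isinstance(f, str):
        return f in X
    if f[0] == "not":
        return not sat(X, f[1])
    if f[0] == "and":
        return sat(X, f[1]) and sat(X, f[2])
    return sat(X, f[1]) or sat(X, f[2])


def lit(f):
    """Atoms occurring in a normal formula (all occurrences are regular)."""
    if f in ("T", "F"):
        return set()
    if isinstance(f, str):
        return {f}
    return set().union(*(lit(g) for g in f[1:]))


def tightness_witness(program, X):
    """Return a function lambda (as a dict) satisfying (*), or None."""
    lower = {a: set() for a in X}  # atoms that must get a smaller value
    for head, P, N in program:
        if head in X and sat(X, P) and sat(X, N):
            lower[head] |= X & lit(P)
    lam = {}
    while len(lam) < len(X):
        ready = [a for a in X if a not in lam and lower[a] <= set(lam)]
        if not ready:  # a cycle: no assignment of ordinals can work
            return None
        for a in ready:
            lam[a] = 1 + max((lam[b] for b in lower[a]), default=-1)
    return lam


# Program (1) and program (12), written in the form (11) as (Head, P, N).
PROGRAM_1 = [("p", "T", ("not", ("not", "p"))), ("p", ("and", "p", "q"), "T")]
PROGRAM_12 = [("p", "q", "T"), ("p", "T", ("not", "r"))]

print(tightness_witness(PROGRAM_1, {"p"}))
print(tightness_witness(PROGRAM_12, {"p", "q", "r"}))
```

The values found coincide with those suggested in the text. On {p, q}, program (1) is not tight, because the second rule then requires $\lambda(p) < \lambda(p)$.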

If the program is finite then, without loss of generality, the values of $\lambda$ can
be assumed to be finite.

For programs whose rules are of the form

$$\mathit{Head} \leftarrow L_1, \dots, L_m, \mathit{not}\ L_{m+1}, \dots, \mathit{not}\ L_n \qquad (13)$$

where each $L_i$ is a literal, the definition of tightness above is slightly more general
than the definition given in [2]. If we take $P$ to be $L_1, \dots, L_m$ and $N$ to be
$\mathit{not}\ L_{m+1}, \dots, \mathit{not}\ L_n$ then the condition $X \models (P, N)$ can be written as

$$L_1, \dots, L_m \in X, \qquad L_{m+1}, \dots, L_n \notin X,$$

and it implies that $X \cap \mathit{lit}(P) = \{L_1, \dots, L_m\}$. In [2], the restriction
$L_{m+1}, \dots, L_n \notin X$ is not included in the definition of tightness.^8 For instance, the program

p^T,T 
p <— p, not q 

is tight on {p,q} in the sense of (*), but it is not tight in the sense of [2]. Note 
that {p, q} is both a model of the completion of this program and its answer set. 
Our generalization of Fages’ theorem is stated as follows: 

^8 The possibility of including this restriction was suggested to us by Hudson Turner
on May 2, 2001. 
Theorem 1 For any finite normal program $\Pi$ whose rules have the form (11)
and any set $X$ of atoms such that $\Pi$ is tight on $X$, $X$ is an answer set for $\Pi$ iff
$X$ satisfies the completion of $\Pi$.

For instance, program (1) is tight on the models $\emptyset$, $\{p\}$ of its completion (2).
By the theorem above, it follows that $\emptyset$ and $\{p\}$ are the answer sets for (1).

Theorem 1 is more general than Fages’ original theorem [7] in three ways.
First, we defined tightness relative to a set $X$; Fages’ definition corresponds to the
special case where $X$ is the set of all atoms. Second, Fages’ definition is limited
to rules of form (13). Third, it does not include the improvement mentioned in
footnote 8.

In the modification of Theorem 1 stated below, the rules of $\Pi$ are not required
to have the form (11), and tightness is replaced by a simpler condition. We say
that a literal $L$ is a positive body element of a nondisjunctive program $\Pi$ if
$\Pi$ contains a rule $\mathit{Head} \leftarrow \mathit{Body}$, with $\mathit{Head} \neq \bot$, such that Body contains a
regular occurrence of $L$ that is not in the scope of negation as failure. We denote
by $\mathit{pos}(\Pi)$ the set of positive body elements of $\Pi$. For instance, if $\Pi$ is (1) then
$\mathit{pos}(\Pi) = \{p, q\}$; if $\Pi$ is (4)-(6) then $\mathit{pos}(\Pi) = \emptyset$.

Theorem 2 For any finite normal program $\Pi$ and any set $X$ of atoms disjoint
from $\mathit{pos}(\Pi)$, $X$ is an answer set for $\Pi$ iff $X$ satisfies the completion of $\Pi$.

For instance, this theorem shows that all models of the completion of the
8-queens program (4)-(6) are answer sets.

Theorems 1 and 2 are proved in the next two sections. 



5 Supported Sets 

The proof of Theorem 1 below is expressed in terms of a model-theoretic
counterpart of completion, supportedness.

We say that a set $X$ of literals is supported by a program $\Pi$ if, for every
literal $L \in X$, there exists a rule (7) in $\Pi$ such that $\mathit{Head} = L$ and
$X \models \mathit{Body}$. In application to finite normal programs, the combination of closure and
supportedness exactly corresponds to the program’s completion:

Proposition 1 For any finite nondisjunctive normal program $\Pi$, a set of atoms
satisfies the completion of $\Pi$ iff it is closed under and supported by $\Pi$.

Proof. Let $\Pi$ be a finite nondisjunctive normal program. Recall that the
completion of $\Pi$ consists of equivalences (9) where $A$ is an atom or the symbol $\bot$.
It is clear that a set $X$ satisfies the completion of $\Pi$ iff, for each $A$,

(a) for every rule (10) in $\Pi$ with the head $A$, if $X \models \mathit{Body}_i$ then $A \in X$, and

(b) if $A \in X$ then $X \models \mathit{Body}_i$ for some rule (10) in $\Pi$ with the head $A$.
(Given the convention about writing normal formulas in the syntax of
propositional logic introduced in Section 3, the relation $X \models F$ restricted to normal
formulas $F$ is equivalent to the satisfaction relation of classical logic.) Condition
(a) expresses that $X$ is closed under $\Pi$, and condition (b) expresses that $X$ is
supported by $\Pi$.

An answer set for a program $\Pi$ is closed under and supported by $\Pi$:

Proposition 2 For any nondisjunctive program $\Pi$ and any consistent set $X$ of
literals, if $X$ is an answer set for $\Pi$ then $X$ is closed under and supported by $\Pi$.

Under the tightness condition, the converse holds as well, that is, the sets closed
under and supported by $\Pi$ are answer sets for $\Pi$:

Proposition 3 For any program $\Pi$ whose rules have the form (11) and any
consistent set $X$ of literals such that $\Pi$ is tight on $X$, if $X$ is closed under and
supported by $\Pi$ then $X$ is an answer set for $\Pi$.

Propositions 2 and 3 generalize Theorem 1 to programs that may contain
classical negation and may consist of infinitely many rules. The proofs of these
propositions are given in the next section. Theorem 1 is an immediate
consequence of Propositions 1-3.
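Proposition 1 can be observed on program (1) by computing both sides independently. The following Python sketch is only an illustration under assumed encodings (tuple formulas, "F" as the head of constraints, hypothetical function names); it checks closure and supportedness from their definitions, checks the completion as a set of equivalences, and compares the two.

```python
from itertools import combinations


def sat(X, f):
    """X |= f for formulas "T", "F", atoms, ("not", f), ("and"/"or", f, g)."""
    if f == "T":
        return True
    if f == "F":
        return False
    if isinstance(f, str):
        return f in X
    if f[0] == "not":
        return not sat(X, f[1])
    if f[0] == "and":
        return sat(X, f[1]) and sat(X, f[2])
    return sat(X, f[1]) or sat(X, f[2])


def closed(X, program):
    """Condition (a): a satisfied body forces its head into X."""
    return all(head in X for head, body in program if sat(X, body))


def supported(X, program):
    """Condition (b): every element of X is the head of a satisfied rule."""
    return all(any(h == a and sat(X, b) for h, b in program) for a in X)


def satisfies_completion(X, program, atoms):
    """A <-> Body_1 or ... or Body_k for every atom A and for "F" (bottom)."""
    return all(
        (a in X) == any(h == a and sat(X, b) for h, b in program)
        for a in list(atoms) + ["F"]
    )


def subsets(atoms):
    for k in range(len(atoms) + 1):
        for combo in combinations(atoms, k):
            yield set(combo)


PROGRAM_1 = [("p", ("not", ("not", "p"))), ("p", ("and", "p", "q"))]
ATOMS = ["p", "q"]

by_definition = [X for X in subsets(ATOMS) if closed(X, PROGRAM_1) and supported(X, PROGRAM_1)]
by_completion = [X for X in subsets(ATOMS) if satisfies_completion(X, PROGRAM_1, ATOMS)]
print(by_definition, by_completion)
```

Both lists contain the same two sets, in line with conditions (a) and (b) of the proof.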



6 Proofs 

Lemma 1 Given a formula $F$ without negation as failure and two sets $Z$, $Z'$ of
literals such that $Z' \subseteq Z$, if $Z' \models F$ then $Z \models F$.

Proof. Immediate by structural induction.

The following lemma is the special case of Proposition 2 in which $\Pi$ is
assumed to be a program without negation as failure.

Lemma 2 For any nondisjunctive program $\Pi$ without negation as failure and
any consistent set $X$ of literals, if $X$ is an answer set for $\Pi$, then $X$ is closed
under and supported by $\Pi$.

Proof. Let $\Pi$ be a nondisjunctive program without negation as failure and $X$
be an answer set for $\Pi$. By the definition of an answer set for programs without
negation as failure, $X$ is closed under $\Pi$. To prove supportedness, take any literal
$L$ in $X$. Since $X$ is minimal among the sets closed under $\Pi$, $X \setminus \{L\}$ is not closed
under $\Pi$. This means that $\Pi$ contains a rule (7) such that $X \setminus \{L\} \models \mathit{Body}$ but
$\mathit{Head} \notin X \setminus \{L\}$. By Lemma 1, $X \models \mathit{Body}$. Since $X$ is closed under $\Pi$, it follows
that $\mathit{Head} \in X$, so that $\mathit{Head} = L$.

The definition of the reduct $F^X$ of a formula $F$ is similar to the definition of
the reduct of a program given in Section 3.
Lemma 3 For any formula $F$, any nondisjunctive program $\Pi$, and any
consistent set $X$ of literals,

(i) $X \models F$ iff $X \models F^X$;

(ii) $X$ is closed under $\Pi$ iff $X$ is closed under $\Pi^X$;

(iii) $X$ is supported by $\Pi$ iff $X$ is supported by $\Pi^X$.

Proof. Part (i) is immediate by structural induction; parts (ii) and (iii) follow
from (i).

Proof of Proposition 2. Consider a nondisjunctive program $\Pi$ and an answer
set $X$ for $\Pi$. By the definition of an answer set, $X$ is an answer set for $\Pi^X$. Then,
by Lemma 2, $X$ is closed under and supported by $\Pi^X$. By Lemma 3(ii,iii), it
follows that $X$ is closed under and supported by $\Pi$.

Lemma 4 For any formula $F$ and any set $X$ of literals, $X \models F$ iff $X \cap \mathit{lit}(F) \models F$.

Proof. Immediate by structural induction.

The following lemma is the special case of Proposition 3 in which $\Pi$ is
assumed to be a program without negation as failure.

Lemma 5 Let $\Pi$ be a program without negation as failure whose rules have the
form (11). For any consistent set $X$ of literals such that $\Pi$ is tight on $X$, if $X$ is
closed under and supported by $\Pi$, then $X$ is an answer set for $\Pi$.

Proof. By the definition of an answer set for programs without negation as
failure, we need to show that no proper subset of $X$ is closed under $\Pi$. Let $Y$
be a proper subset of $X$, and let $\lambda$ be a function from $X$ to ordinals satisfying
condition (*) from the definition of tightness (Section 4). Take a literal $L \in X \setminus Y$
such that $\lambda(L)$ is minimal. Since $X$ is supported by $\Pi$, there is a rule

$$L \leftarrow P, N$$

in $\Pi$ such that

$$X \models (P, N). \qquad (14)$$

By the choice of $\lambda$, for all $L' \in X \cap \mathit{lit}(P)$,

$$\lambda(L') < \lambda(L).$$

By the choice of $L$, no literal $L'$ satisfying this inequality may belong to $X \setminus Y$,
so that $X \cap \mathit{lit}(P)$ is disjoint from $X \setminus Y$. Consequently, $X \cap \mathit{lit}(P) \subseteq Y$. Since
$\Pi$ does not contain negation as failure, and $N$ is a negative formula, $\mathit{lit}(N) = \emptyset$.
Consequently, $\mathit{lit}(P, N) = \mathit{lit}(P)$, so that

$$X \cap \mathit{lit}(P, N) \subseteq Y. \qquad (15)$$

By Lemma 4, we can conclude from (14) that

$$X \cap \mathit{lit}(P, N) \models (P, N).$$

In view of (15), it follows by Lemma 1 that $Y \models (P, N)$. Since the head $L$ of
this rule does not belong to $Y$, $Y$ is not closed under $\Pi$.
Lemma 6 For any program $\Pi$ whose rules have the form (11) and any
consistent set $X$ of literals, if $\Pi$ is tight on $X$ then so is $\Pi^X$.

Proof. Let $\Pi$ be a program whose rules have the form (11) and $X$ be a consistent
set of literals. Suppose that $\Pi$ is tight on $X$. Then there exists a function $\lambda$ from
$X$ to ordinals such that, for every rule (11) in $\Pi$, if $\mathit{Head} \in X$ and $X \models (P, N)$
then, for all $L \in X \cap \mathit{lit}(P)$,

$$\lambda(L) < \lambda(\mathit{Head}).$$

By Lemma 3(i) and the definition of the reduct, $X \models (P, N)$ iff $X \models (P^X, N^X)$.
By the definition of the reduct, $\mathit{lit}(P^X) \subseteq \mathit{lit}(P)$. Therefore, for every rule (11)
in $\Pi$, if $\mathit{Head} \in X$ and $X \models (P^X, N^X)$ then, for all $L \in X \cap \mathit{lit}(P^X)$, the
inequality above is satisfied. Since every rule of $\Pi^X$ has the form

$$\mathit{Head} \leftarrow P^X, N^X$$

for some rule (11) in $\Pi$, it follows that $\Pi^X$ is tight on $X$.

Proof of Proposition 3. Consider a program $\Pi$ whose rules have the form
(11) and a consistent set $X$ of literals such that $\Pi$ is tight on $X$. Assume that
$X$ is closed under and supported by $\Pi$. Then, by Lemma 3(ii,iii), $X$ is closed
under and supported by $\Pi^X$. By Lemma 6, since $\Pi$ is tight on $X$, so is $\Pi^X$.
Hence, by Lemma 5, $X$ is an answer set for $\Pi^X$, and consequently an answer
set for $\Pi$.

Proof of Theorem 2. Let $\Pi$ be a finite normal program and let $X$ be a set
of literals disjoint from $\mathit{pos}(\Pi)$. As discussed in Section 4, the transformations
from [10] allow us to turn $\Pi$ into a program $\Pi'$ with the same answer sets such
that every rule of $\Pi'$ has the form (11) where $P$ does not contain negation as
failure. The examination of these transformations shows that the completion of
$\Pi'$ is equivalent to the completion of $\Pi$. Furthermore, these transformations
do not change the set of positive body elements, so that $\mathit{pos}(\Pi') = \mathit{pos}(\Pi)$.
Consequently, $X$ is disjoint from $\mathit{pos}(\Pi')$, which implies that $X$ is disjoint from
$\mathit{lit}(P)$ for every rule (11) of $\Pi'$. It follows that condition (*) from the definition
of tightness (Section 4) applied to $\Pi'$ holds trivially for any choice of $\lambda$. Then,
by Theorem 1, $X$ is an answer set for $\Pi'$ iff $X$ satisfies the completion of $\Pi'$.
Consequently, this condition holds for $\Pi$ as well.

7 Conclusion 

We extended the theorem by François Fages that describes the relationship
between the completion semantics and the answer set semantics of logic programs
to programs with nested expressions permitted in the bodies of rules. The study
of this relationship is important from the perspective of answer set programming,
and nested expressions are interesting because of their relation to some syntactic
constructs that play an important role in the input language of SMODELS.
Experiments show that, given a representation of a computational problem in that
language, it is sometimes faster to find a solution by running a propositional
solver on the completion of the corresponding program with nested expressions
than by using SMODELS.
Abstract. The aim of our work is the definition of a model-theoretic
semantics of normal logic programs with embedded implications. We first
propose a quite simple operational semantics for this class of programs 
whose negation mechanism is the constructive negation. This semantics 
is used to prove the adequacy of the model-theoretic semantics. Then 
we define a declarative semantics for this class of programs in terms of 
Beth models and show that in the model class associated to every pro- 
gram there is a least model that can be seen as the semantics of the 
program, which may be built upwards as the least fixpoint of a con- 
tinuous immediate consequence operator. Finally, it is proved that the 
operational semantics is sound and complete with respect to the least 
fixpoint semantics. 



1 Introduction 

There are two main approaches to decompose large logic programs into man- 
ageable units [2]. On the one hand, different kinds of modular units, similar to 
the module notions existing in other programming paradigms, have been pro- 
posed and studied together with the corresponding composition operations. This 
kind of modularization can be considered “external” to the logic programming 
paradigm, since it is based on the use of constructions which are, to a certain 
extent, independent of any programming formalism [10]. Conversely, the second 
approach provides a structuring mechanism in terms of a logical connective. In 
particular, this approach originated in the work of Miller [9] . The idea is that 
intuitionistic implication may be used to structure a logic program into blocks, 
as it is done in imperative programming languages. For more details on the use 
of this connective the reader may look, for instance, at [2]. 

Providing semantics to normal logic programming units is, in general, a dif- 
ficult task, due to the non-monotonic nature of negation that hinders the defi- 
nition of a compositional semantics. Nevertheless, a certain amount of work has 
been done for defining the semantics of normal logic program modules according 
to the first approach mentioned above (see, e.g., [6,4]). This is not quite true 
for the second approach. To our knowledge, only [1] and [7] consider this case.
The reason for this lack of work is, probably, the apparent difficulty in mixing 
two semantically very different connectives such as negation and intuitionistic 
implication. Intuitionistic connectives seem to ask for intuitionistic models, like 
Kripke or Beth models, while negation seems to ask for some kind of 3-valued 
models. The semantics presented in [1] and [7] is defined in terms of some sort 
of Kripke models. The results presented in [1] are very restrictive. Only the case 
of negation as failure is considered, programs must be stratified and signatures 
may only contain predicate symbols, i.e. function symbols are not allowed. The 
work presented in [7] also deals with negation as failure, but the other restric- 
tions are not present. In addition, we found the model-theoretic semantics quite 
ad-hoc. The kind of Kripke models used are not really intuitionistic: first, the 
interpretation associated to a given world is a three-valued structure and, second
and more importantly, the order relation between worlds is not monotonic, in 
contrast with the intuition underlying intuitionistic Kripke models. This means 
that if an atom is satisfied by the interpretation associated to a given world, 
then it may not be satisfied by the interpretation associated to a greater world. 

In [8] we defined a new declarative compositional semantics for a general 
class of normal logic program units, in terms of a class of models that we called 
ranked. As we pointed out in that paper, ranked models are, intuitively, quite 
close to Beth models. This led us to think that both connectives could have a
natural and reasonably simple semantics in terms of intuitionistic (Beth) mod- 
els. Moreover, this semantics would make more explicit the intuitionistic nature 
of negation in logic programming already pointed out by other authors (e.g. 
[12]). In this paper we follow these ideas. We first propose a quite simple op- 
erational semantics for this class of programs whose negation mechanism is the 
constructive negation [3,5,13]. This semantics is used to prove the adequacy of 
the model-theoretic semantics. Then we define a declarative semantics for this 
class of programs in terms of Beth models and show that in the model class asso- 
ciated to every program there is a least model that can be seen as the semantics 
of the program, which may be built upwards as the least fixpoint of a continu- 
ous immediate consequence operator. Finally, it is proved that the operational 
semantics is sound and complete with respect to the least fixpoint semantics. 

This paper is organized as follows. In the following section we present some 
basic concepts and notation used in the paper. Section 3 introduces the oper- 
ational semantics of our programs. Section 4 defines the Beth models used in 
the paper and their associated forcing relation. The next section introduces the 
immediate consequence operator showing that it is monotonic and continuous. 
Section 6 defines an ordering relation on the class of models of a program and
shows that the least model coincides with the least fixpoint of the immediate 
consequence operator. Finally, in Section 7, the operational semantics is proved 
to be sound and complete with respect to the least model semantics. 
2 Preliminaries

A countable signature $\Sigma$ consists of a pair of sets $(FS_\Sigma, PS_\Sigma)$ of function and
predicate symbols with some associated arity. Terms, atoms or first order
formulas built by using functions and/or predicates from $\Sigma$ and, also, variables
from a fixed countable set $X$ of variable symbols are called $\Sigma$-terms, $\Sigma$-atoms
and $\Sigma$-formulas, respectively. The Herbrand universe is the set
of all ground $\Sigma$-terms constructed by using $\Sigma$-functions. Terms are denoted by
$t, s, \dots$, predicate symbols by $p, q, \dots$ and function symbols by $f, g, \dots$ Letters
$a$, $b$ denote atoms and the character $\ell$ is used for literals. A formula whose
subterms are variables is called a flat formula, and $\varphi^\forall$ and $\varphi^\exists$ are the universal and
existential closure of $\varphi$, respectively. The logical constants are denoted by $\mathbf{t}$ and
$\mathbf{f}$. Programs are denoted by using the letters $P$ and $Q$. In general, subscripts
and superscripts will be used if needed and a bar is used to denote (finite)
sequences of objects. Normal logic programs with embedded implications over a
signature $\Sigma$ are finite sets of clauses $a \mathrel{:-} G_1, \dots, G_k$, where $a$ is a $\Sigma$-atom and
$\forall i \in \{1, \dots, k\}$, $G_i$ is an intuitionistic $\Sigma$-literal, that is, either a $\Sigma$-literal, $b$ or
$\neg b$, where $b$ is a $\Sigma$-atom; or an intuitionistic $\Sigma$-expression, $P_i \supset G'_i$, where $P_i$ is
a $\Sigma$-program and $G'_i$ is an intuitionistic $\Sigma$-literal. The idea behind this kind of
goals is that, when evaluating $G'_i$, one may use the definitions in $P_i$ as auxiliary
local definitions (in addition to other global definitions in the given program).
Clauses whose head is empty correspond to goals of this class of language, called
intuitionistic $\Sigma$-goals.

Free variables in a clause are assumed to be implicitly quantified universally. 
This means that the scope of a variable is the clause where it is defined. In 
particular, given the goal $\{p(x).\} \supset p(f(x))$, $x$ in $p(x)$ is not considered to be
bound to $x$ in $p(f(x))$ and, as a consequence, the goal should succeed.

We consider that clauses are written following the structure of constraint
normal clauses with flat head. That is, $p(t_1, \dots, t_n) \mathrel{:-} G_1, \dots, G_k$ is written as
the constrained clause $p(x_1, \dots, x_n) \mathrel{:-} G_1, \dots, G_k \,\Box\, x_1 = t_1, \dots, x_n = t_n$.
Moreover, we suppose that the identical tuple $x_1, \dots, x_n$ of fresh variables
occurs in all clauses (in a program) with predicate $p$ in its heads. Also, just to
simplify, clauses of the form $a \mathrel{:-} \Box\,\mathbf{t}$ are written as $a$.
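The rewriting of a clause into its flat-head constrained form is purely syntactic, and a short Python sketch may make it concrete. Everything here is an illustrative assumption rather than part of the paper: terms and goals are plain strings, the fresh variables are named x1, ..., xn and are assumed not to occur elsewhere in the clause, and the function names are hypothetical.

```python
def flatten_head(pred, args, body):
    """Rewrite p(t1,...,tn) :- body as p(x1,...,xn) :- body [] x1=t1,...,xn=tn.

    Assumption: the names x1, ..., xn do not occur in args or body,
    so they can serve as the fresh head variables.
    """
    fresh = [f"x{i}" for i in range(1, len(args) + 1)]
    constraint = list(zip(fresh, args))  # one equation x_i = t_i per argument
    return pred, fresh, body, constraint


def show(clause):
    """Render a flattened clause in a notation close to the text."""
    pred, fresh, body, constraint = clause
    head = f"{pred}({', '.join(fresh)})"
    goals = ", ".join(body)
    eqs = ", ".join(f"{x} = {t}" for x, t in constraint)
    return f"{head} :- {goals} □ {eqs}"


# Hypothetical clause p(f(y), a) :- q(y)
clause = flatten_head("p", ["f(y)", "a"], ["q(y)"])
print(show(clause))
```

Using the same tuple of fresh names for every clause of a predicate matches the convention stated above that all clauses for p share the identical tuple of head variables.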

The set of definitions of a predicate $p$ in a program $P$ is defined as usual:

$$\mathit{Def}(P, p) = \{\, p(\bar{x}) \mathrel{:-} \bar{G} \,\Box\, c \in P \,\}$$

Constraints occurring in programs are equality $\Sigma$-constraints, that is, arbitrary
first order $\Sigma$-formulas in which the only relational symbol occurring in atoms is
the equality (formulas composing equality atoms with the connectives $\land$, $\lor$, $\rightarrow$, $\neg$
and the quantifiers $\forall$, $\exists$). Constraints are denoted by using the letters $c$ and $d$
(possibly with sub or super-scripts). We will handle constraints in a logical way,
using logical consequence of the free equality theory, $\mathit{FET}_\Sigma$ (see, e.g., [12]).

A constraint $c$ is satisfiable (resp. unsatisfiable) if, and only if, $\mathit{FET}_\Sigma \models c^\exists$
(resp. $\mathit{FET}_\Sigma \models \neg(c^\exists)$); a constraint $d$ is less general than $c$ if, and only if,
$\mathit{FET}_\Sigma \models (d \rightarrow c)^\forall$. A ground substitution $\bar{x} = \bar{t}$ (where $\bar{t}$ are closed terms) is
called a solution of a constraint $c$ if, and only if, $\mathit{FET}_\Sigma \models (\bar{x} = \bar{t} \rightarrow c)^\forall$.
A constrained $\Sigma$-atom is a pair $p(\bar{x}) \,\Box\, c(\bar{x})$, where $p \in PS_\Sigma$ and $c(\bar{x})$ is a
satisfiable $\Sigma$-constraint. The set of all the constrained $\Sigma$-atoms is denoted $C_\Sigma(X)$.

3 Operational Semantics 

In this section we introduce an operational semantics for the class of normal logic programs with embedded implication. This semantics is presented in terms of a derivation relation over sequents of the form $P \vdash G \,\Box\, c$, where $P$ is a $\Sigma$-program and $\mathrel{:-} G \,\Box\, c$ is a $\Sigma$-goal. Our semantics is very simple, but not very useful for practical purposes, since it is too non-deterministic to be implemented directly. Its main aim is to show the adequacy of the model-theoretic semantics defined below. In particular, our treatment of negation can be seen as a more abstract and simple variation of [3,5,13]. We could instead have introduced an operational semantics closer to implementation, for instance a variation of the BCN semantics [11]. However, the proofs of the main results of this paper would have been slightly more complex.

The following mutually recursive definitions establish our semantics. 

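Definition 1. Let $P$ be a $\Sigma$-program and $\mathrel{:-} G \,\Box\, c$ a $\Sigma$-goal. $P \vdash G \,\Box\, c$ can be proved with computed answer $c'$ if and only if there exists a finite derivation of $P \vdash G \,\Box\, c$, that is, $P \vdash G \,\Box\, c \leadsto^{n} P \vdash \Box\, c'$, $n \geq 0$, $FET_\Sigma \models c'^{\exists}$, where $\leadsto^{n}$ corresponds to $n$ applications (derivation steps) of the relation $\leadsto$ over sequents.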

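Definition 2. The derivation relation $\leadsto$ over sequents is defined as follows:

1. $P \vdash G_1, p(x), G_2 \,\Box\, c \leadsto P \vdash G_1, G, G_2 \,\Box\, c \wedge d$ if there exists a (renamed apart) clause $p(x) \mathrel{:-} G \,\Box\, d \in Def(P, p)$ and $FET_\Sigma \models (c \wedge d)^{\exists}$.

2. $P \vdash G_1, \neg p(x), G_2 \,\Box\, c \leadsto P \vdash G_1, G_2 \,\Box\, c'$ if for every (renamed apart) clause $p(x) \mathrel{:-} G_1, \ldots, G_m \,\Box\, d$ there exists $J \subseteq \{1, \ldots, m\}$ such that $\forall j \in J$: $P \vdash \neg G_j \,\Box\, d$ can be proved with computed answer $d_j$ and $FET_\Sigma \models (c' \rightarrow \neg d \vee \bigvee_{j \in J} d_j)^{\forall}$.

3. $P \vdash G_1, Q \supset G, G_2 \,\Box\, c \leadsto P \vdash G_1, G_2 \,\Box\, c'$ if $P \cup Q \vdash G \,\Box\, c$ can be proved with computed answer $c'$.

4. $P \vdash G_1, \neg(Q \supset G), G_2 \,\Box\, c \leadsto P \vdash G_1, G_2 \,\Box\, c'$ if $P \cup Q \vdash \neg G \,\Box\, c$ can be proved with computed answer $c'$.

We assume that whenever a constrained $\Sigma$-atom $\neg\neg a \,\Box\, c$ occurs in the right part of a sequent, it denotes $a \,\Box\, c$.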
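The first and third rules of the derivation relation can be sketched in a few lines of Python. This is only an illustration under strong simplifying assumptions that are not in the paper: programs are propositional, there are no constraints and no negation, and the search is depth-bounded to avoid loops. The encoding (`("imp", Q, G)` for an embedded implication) and the function name `provable` are hypothetical.

```python
from typing import FrozenSet, Tuple, Union

# Hypothetical encoding (not from the paper): an atom is a string; an embedded
# implication "Q implies G" is the tuple ("imp", Q, G), Q a frozenset of clauses.
Goal = Union[str, tuple]
Clause = Tuple[str, Tuple[Goal, ...]]      # (head, body goals); a fact has an empty body
Program = FrozenSet[Clause]


def provable(program: Program, goal: Goal, depth: int = 20) -> bool:
    """Depth-bounded search for a derivation of `goal` from `program`."""
    if depth == 0:
        return False                        # give up; guards against loops such as p :- p
    if isinstance(goal, tuple) and goal[0] == "imp":
        _, extra, inner = goal
        # Rule 3: prove the inner goal after extending the program with Q.
        return provable(program | extra, inner, depth - 1)
    # Rule 1: pick a clause for the atom and prove every goal in its body.
    return any(
        all(provable(program, g, depth - 1) for g in body)
        for head, body in program
        if head == goal
    )


Q = frozenset({("p", ())})                                   # the fact p
P = frozenset({("r", (("imp", Q, "s"),)), ("s", ("p",))})   # r :- (Q implies s), s :- p

print(provable(P, "r"))   # True: s is provable once Q is added to P
print(provable(P, "s"))   # False: p is not available in P alone
```

The point of the sketch is the scoping behaviour of rule 3: the clauses of `Q` are visible only while the goal on the right of the implication is being proved.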

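Example 3. Given the programs $P = \{\, p(x) \mathrel{:-} p(x) \,\Box\, x = a,\ q(x) \mathrel{:-} \Box\, x = a \,\}$ and $Q = \{\, r \mathrel{:-} p(x), \neg q(x) \,\}$,

$P \vdash Q \supset \neg r \leadsto P \vdash \Box\, t$ because (def. 2.3):

$P \cup Q \vdash \neg r \leadsto P \cup Q \vdash \Box\, t$ (def. 2.2)

$P \cup Q \vdash \neg p(x) \leadsto P \cup Q \vdash \Box\, x \neq a$ (def. 2.2) and
$P \cup Q \vdash q(x) \leadsto P \cup Q \vdash \Box\, x = a$ (def. 2.1)
and $FET_\Sigma \models (x \neq a \vee x = a)^{\forall}$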







4 Model Theory Semantics 

In this section, we introduce a class of Beth models, and an associated forcing 
relation, to define the semantics of our programs. Beth models and Kripke models 
are based on a similar intuition [14]. Both kinds of models are defined as a family 
of logical structures, where each structure, (corresponding to a world) denotes 
the amount of knowledge one has at a certain moment. Worlds are (partially) 
ordered, where the ordering relation denotes the increase of knowledge. In these 
models a forcing relation plays the role of satisfaction. Forcing is defined for 
each world and defines what one can expect to be true in the given world. The 
key difference between Kripke and Beth models is in the definition of the forcing 
relation. In Kripke models an atomic formula is forced in a world w if it is satisfied 
by the associated structure. In Beth models, an atomic formula is forced in w if 
we may be sure that it will be satisfied in the future. This is formalized by saying that the formula is satisfied by all the structures in a bar for that world, where a bar for w is a set of worlds such that any increasing sequence of worlds starting in w contains a world in the bar.

In our case, worlds are pairs $(P, L)$, where $P$ is a program and $L$ is a set of constrained atoms. The structure associated to a world is also represented (as a variation of Herbrand structures) as a set of constrained atoms. The intuition is that for a given world $(P, L)$, assuming that the given program includes all the clauses in $P$, we know that the atoms in $L$ are false and the ones in the associated structure are true. Worlds can be seen as stages in a computation, where additional computation provides additional knowledge. The idea is that the atoms in $L$, for a world $(P, L)$, should be supported by the given program and by the knowledge in the previous worlds. In this context, forcing should be defined as for Beth models: an atomic formula is forced in a world if it will hold in the future.

Example 4. To illustrate the ideas above, given $P = \{\, p \mathrel{:-} \neg q,\ q \mathrel{:-} \neg r,\ s \mathrel{:-} \neg p \,\}$, a model for $P$ may include, for instance, the worlds $(\emptyset, \emptyset)$, $(\emptyset, \{r\})$, $(\emptyset, \{p, r\})$ with the associated structures $\emptyset$, $\{q\}$, $\{q, s\}$. In this model, the world $(\emptyset, \{r\})$ together with its associated structure $\{q\}$ represents that, at a certain stage, we may know that $q$ is true but $r$ is not. The fact that the program in these worlds is empty means that we may have this knowledge without assuming that we have additional clauses (other than the ones in $P$). However, we may consider that the atom $s$ is forced in this world, because it holds in the following (larger) world.
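The stages of Example 4 can be reproduced with a small Python sketch. It assumes a much simpler setting than the paper's: propositional clauses, no constraints, no embedded implications, and a single chain of worlds for the empty program. The names `stages`, `positive_closure`, `holds` and `refuted` are hypothetical. At each stage an atom becomes false when every one of its clauses has a body literal already refuted, and an atom becomes true when one of its clauses has a body that holds.

```python
from typing import Dict, FrozenSet, List, Tuple

Literal = Tuple[str, bool]               # (atom, True) for a, (atom, False) for "not a"
Rules = Dict[str, List[List[Literal]]]   # head -> list of clause bodies


def holds(lit: Literal, true: FrozenSet[str], false: FrozenSet[str]) -> bool:
    atom, positive = lit
    return atom in true if positive else atom in false


def refuted(lit: Literal, true: FrozenSet[str], false: FrozenSet[str]) -> bool:
    atom, positive = lit
    return atom in false if positive else atom in true


def positive_closure(rules: Rules, false: FrozenSet[str], true: FrozenSet[str]) -> FrozenSet[str]:
    """Add every head that has a body whose literals all hold, until stable."""
    while True:
        derived = frozenset(
            head for head, bodies in rules.items()
            if any(all(holds(l, true, false) for l in body) for body in bodies)
        )
        if derived <= true:
            return true
        true = true | derived


def stages(rules: Rules, atoms) -> List[Tuple[FrozenSet[str], FrozenSet[str]]]:
    """Return the successive pairs (false atoms, true atoms)."""
    false, true = frozenset(), frozenset()
    history = []
    while True:
        true = positive_closure(rules, false, true)
        history.append((false, true))
        # An atom is added to the false set when all its clauses are refuted
        # by the information of the current stage (vacuously so if it has none).
        new_false = false | frozenset(
            a for a in atoms
            if all(any(refuted(l, true, false) for l in body)
                   for body in rules.get(a, []))
        )
        if new_false == false:
            return history
        false = new_false


# P = { p :- not q,  q :- not r,  s :- not p }
rules = {"p": [[("q", False)]], "q": [[("r", False)]], "s": [[("p", False)]]}
for false, true in stages(rules, {"p", "q", "r", "s"}):
    print(sorted(false), sorted(true))
# [] []
# ['r'] ['q']
# ['p', 'r'] ['q', 's']
```

The three printed pairs correspond to the worlds $(\emptyset, \emptyset)$, $(\emptyset, \{r\})$ and $(\emptyset, \{p, r\})$ with structures $\emptyset$, $\{q\}$ and $\{q, s\}$.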

Definition 5. A $\Sigma$-world $w$ is a pair $(P_w, L_w)$ where $P_w$ is a $\Sigma$-program up to variable renaming and $L_w \subseteq C_\Sigma(X)$. The set of all $\Sigma$-worlds is denoted $W_\Sigma$. A $\Sigma$-structure is a 3-tuple $\mathcal{B} = (W, \preceq, I)$, where $W \subseteq W_\Sigma$ and

1. For every $\Sigma$-program $P$, $(P, \emptyset) \in W$.

2. $\preceq$ is a partial order on $W$, such that $\forall v, w \in W$: $v \preceq w$ if, and only if, $P_v = P_w$ and $L_v \subseteq L_w$. The strict order associated to $\preceq$ is denoted $\prec$.

3. The interpretation function $I : W \rightarrow 2^{C_\Sigma(X)}$ satisfies the following properties:

(a) $\forall v \in W$, $t \,\Box\, c \in I(v)$ if $FET_\Sigma \models c^{\forall}$

(b) (Monotonicity) $\forall v, w \in W$, if $v \preceq w$ then $I(v) \subseteq I(w)$

In addition, we consider that, $\forall w \in W$, the sets of constrained atoms $L_w$ and $I(w)$ are closed under renaming of variables, disjunction of constraints and less general constraints. Also, $\mathcal{B}(P) = (W(P), I(P))$ corresponds to the ordered structure associated to $P$ occurring in $\mathcal{B}$ such that $W(P) = \{\, w \in W \mid (P, \emptyset) \preceq w \,\}$ and $I(P) = \{\, I(w) \mid w \in W(P) \,\}$.

For simplicity, we will assume the following notational conventions: $\ell \,\Box\, c \in (I(w), L_w)$ means $a \,\Box\, c \in I(w)$ if $\ell = a$, and $a \,\Box\, c \in L_w$ if $\ell = \neg a$. Conversely, we write $\neg\ell \,\Box\, c \in (I(w), L_w)$ to denote $a \,\Box\, c \in L_w$ if $\ell = a$, and $a \,\Box\, c \in I(w)$ if $\ell = \neg a$.

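Definition 6 (forcing). Let $\mathcal{B} = (W, \preceq, I)$ be a $\Sigma$-structure. We say that $B \subseteq W$ is a bar w.r.t. a world $v \in W$ iff for each $\preceq$-increasing chain of worlds $v_0, v_1, \ldots$ in $W$ such that $v_0 = v$, there exist $k \geq 0$ and $w \in B$ such that $v_k \preceq w \preceq v_{k+1}$. The bar $B$ is strict iff $\forall v, w \in B$, $v \not\prec w$ and $w \not\prec v$. Then, the forcing relation $\Vdash$ on a $\Sigma$-structure $\mathcal{B}$ is inductively defined for every world as follows. Let $v \in W$:

1. For all satisfiable constraints $c$ and $d$: $v, \mathcal{B} \Vdash c \,\Box\, d$ iff $FET_\Sigma \models (c \wedge d)^{\exists}$.

2. $v, \mathcal{B} \Vdash \ell \,\Box\, c$ iff there exists a bar $B \subseteq W$ with respect to $v$ such that for all $w \in B$: $\ell \,\Box\, c \in (I(w), L_w)$.

3. $v, \mathcal{B} \Vdash G_1, G_2 \,\Box\, c$ iff $v, \mathcal{B} \Vdash G_1 \,\Box\, c$ and $v, \mathcal{B} \Vdash G_2 \,\Box\, c$.

4. $v, \mathcal{B} \Vdash \neg(G_1, \ldots, G_m) \,\Box\, c$ iff $v, \mathcal{B} \Vdash \neg G_j \,\Box\, c_j$ for some $j \in J \subseteq \{1, \ldots, m\}$ and $FET_\Sigma \models (c \rightarrow \bigvee_{j \in J} c_j)^{\forall}$.

5. $v, \mathcal{B} \Vdash P \supset G \,\Box\, c$ iff there exists a satisfiable constraint $c'$, $FET_\Sigma \models (c' \rightarrow c)^{\forall}$, such that $(P_v \cup P, \emptyset), \mathcal{B} \Vdash G \,\Box\, c'$.

6. $v, \mathcal{B} \Vdash \neg(P \supset G) \,\Box\, c$ iff there exists a satisfiable constraint $c'$, $FET_\Sigma \models (c' \rightarrow c)^{\forall}$, such that $(P_v \cup P, \emptyset), \mathcal{B} \Vdash \neg G \,\Box\, c'$.

7. $v, \mathcal{B} \Vdash p(x) \mathrel{:-} G \,\Box\, d$ iff $\forall w \succeq v$, if $w, \mathcal{B} \Vdash G \,\Box\, d$ then $w, \mathcal{B} \Vdash p(x) \,\Box\, d$.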

A program $P$ can be seen as an intuitionistic theory. As a consequence, one could just define the class of models defined by $P$ as the subclass of all the structures such that $P$ is forced by the world $(\emptyset, \emptyset)$ (or, perhaps, by the world $(P, \emptyset)$). However, this is not satisfactory for our purposes. Many models in
that class would not agree with the computational interpretation of our models 
discussed above. According to the definition below, a structure is a model of 
a program P if two conditions are satisfied. The first one is that the structure 
associated to a world (P', L) should satisfy all the consequences that could be 
computed from the clauses in P and in P' and the negative information in L. 
The second condition states that the negative information, L, in a world (P', L) 
must be supported by the clauses in P and in P' and the information included 
in previous worlds. 

Now, in order to formalize these intuitions we will define a notion of local forcing, which can be seen as a kind of local satisfaction on a given world. There are two key ideas in this definition. The first one is to consider that a positive literal $\ell$ is locally forced in a world $w$ if $\ell$ is in the interpretation of $w$, and a negative literal $\ell$ is locally forced in $w = (P, L)$ if $\ell$ is in $L$. The second idea is to consider that a formula $P' \supset \ell$ is locally forced in a world $w = (P, L)$ if $\ell$ is locally forced in a bar for the world $(P \cup P', \emptyset)$ consisting of worlds $(P \cup P', L')$ where $L'$ is included in $L$. This means that in $L$ we have enough negative information to compute $\ell$. The extension to other kinds of formulas is the obvious one.

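Definition 7 (Local forcing). The local forcing relation, $\Vdash_l$, on a $\Sigma$-structure $\mathcal{B} = (W, \preceq, I)$ is inductively defined for every world as follows. Let $v \in W$, then:

1. $v, \mathcal{B} \Vdash_l \ell \,\Box\, c$ iff $\ell \,\Box\, c \in (I(v), L_v)$.

2. $v, \mathcal{B} \Vdash_l P^1 \supset \ldots \supset P^n \supset \ell \,\Box\, c$ iff there exists a satisfiable constraint $c'$, $FET_\Sigma \models (c' \rightarrow c)^{\forall}$, and there exists a bar $B$ w.r.t. $(P_v \cup P^1 \cup \ldots \cup P^n, \emptyset)$ such that $\forall v' \in B$, $L_{v'} \subseteq L_v$ and $\ell \,\Box\, c' \in (I(v'), L_{v'})$.

3. $v, \mathcal{B} \Vdash_l \neg(P^1 \supset \ldots \supset P^n \supset \ell) \,\Box\, c$ iff $v, \mathcal{B} \Vdash_l P^1 \supset \ldots \supset P^n \supset \neg\ell \,\Box\, c$.

4. $v, \mathcal{B} \Vdash_l G_1, G_2 \,\Box\, c$ iff $v, \mathcal{B} \Vdash_l G_1 \,\Box\, c$ and $v, \mathcal{B} \Vdash_l G_2 \,\Box\, c$.

5. $v, \mathcal{B} \Vdash_l p(x) \mathrel{:-} G \,\Box\, d$ iff $\forall w \succeq v$, if $w, \mathcal{B} \Vdash_l G \,\Box\, d$ then $w, \mathcal{B} \Vdash_l p(x) \,\Box\, d$.

The relation $\Vdash_l$ is included in $\Vdash$, i.e. if $v, \mathcal{B} \Vdash_l G \,\Box\, c$ then $v, \mathcal{B} \Vdash G \,\Box\, c$.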

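Definition 8 (Models). $\mathcal{B} \in Struct(\Sigma)$ is a model of $P$, written $\mathcal{B} \models P$, if, and only if, $\forall w \in W_{\mathcal{B}}$, the following two conditions hold:

1. $\forall p(x) \mathrel{:-} G \,\Box\, d \in P \cup P_w$: $w, \mathcal{B} \Vdash_l p(x) \mathrel{:-} G \,\Box\, d$.

2. Supported worlds: if $p \,\Box\, c \in L_w$ then $\forall p(x) \mathrel{:-} G_1, \ldots, G_m \,\Box\, d \in Def(P \cup P_w, p)$, there exist satisfiable constraints $\{d_j\}_{j \in J}$, $J \subseteq \{1, \ldots, m\}$, such that $\forall j \in J$ $\exists v \in W_{\mathcal{B}}$, $v \prec w$, with $v, \mathcal{B} \Vdash_l \neg G_j \,\Box\, d_j$ and $FET_\Sigma \models (c \rightarrow \neg d \vee \bigvee_{j \in J} d_j)^{\forall}$.

For every program $P$, we define $Mod(P)$ as the class of all its models.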

Example 9. Consider the program $P = \{\, r \mathrel{:-} (\{p \mathrel{:-} \neg q\} \supset s),\ s \mathrel{:-} p \,\}$. A model of $P$ could include any of the structures $\mathcal{B}_1$ or $\mathcal{B}_2$ described below, where the sets at the right-hand side of the worlds in $\mathcal{B}_1$ or $\mathcal{B}_2$ denote their interpretation.

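[Figure: the structures $\mathcal{B}_1$ and $\mathcal{B}_2$, each drawn as two chains of worlds, one for the program $\emptyset$ and one for the program $\{p \mathrel{:-} \neg q\}$, with the interpretation written to the right of each world. In $\mathcal{B}_1$ the chain for $\emptyset$ reads, from bottom to top: $(\emptyset, \emptyset)$ with $\emptyset$; $(\emptyset, \{p, q\})$ with $\{r\}$; $(\emptyset, \{p, q, s\})$ with $\{r\}$.]
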

5 Least Fixpoint Semantics 

In this section, we define an immediate consequence operator $T_P$ that can be used, as usual, to build (bottom-up) a least fixpoint of the operator, which is shown to be a model of the given program $P$. This fixpoint will be shown to be the least model in $Mod(P)$, with respect to an ordering that will be defined in the following section. Moreover, the operational semantics defined above will be shown to be sound and complete with respect to that model. However, $T_P$ is not defined on all $\Sigma$-structures as one would expect, since the class $Struct(\Sigma)$ includes some structures that are not constructive. Instead, $T_P$ is defined on the subclass of Noetherian $\Sigma$-structures, which is enough for our purposes. In particular, we show that our operator is monotonic and continuous for this subclass.
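As a reminder of the general bottom-up construction that this section adapts, the following Python sketch iterates the classical immediate consequence operator of a definite propositional program until it stabilizes. It is not the operator $T_P$ of this paper (there are no worlds, no negation and no constraints); the names `immediate_consequences` and `least_fixpoint` are hypothetical.

```python
from typing import Callable, FrozenSet, List, Tuple

Clause = Tuple[str, Tuple[str, ...]]   # (head, body atoms); a fact has an empty body
Interp = FrozenSet[str]


def immediate_consequences(clauses: List[Clause]) -> Callable[[Interp], Interp]:
    """Build the classical one-step operator of a definite program."""
    def step(interp: Interp) -> Interp:
        return frozenset(h for h, body in clauses if all(b in interp for b in body))
    return step


def least_fixpoint(step: Callable[[Interp], Interp], bottom: Interp = frozenset()) -> Interp:
    """Iterate a monotonic operator from the bottom element until it is stable."""
    current = bottom
    while True:
        nxt = step(current)
        if nxt == current:
            return current
        current = nxt


clauses = [("a", ()), ("b", ("a",)), ("c", ("b", "d"))]
print(sorted(least_fixpoint(immediate_consequences(clauses))))   # ['a', 'b']
```

Termination is guaranteed here only because the set of atoms is finite and the operator is monotonic; the paper obtains the analogous guarantee from continuity on Noetherian structures.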

Although the definition of $T_P$ may look complex, the intuition is quite simple. Given a $\Sigma$-structure $\mathcal{A}$, on the one hand, for every world $(P', L)$ in $\mathcal{A}$, $T_P$ builds a new world, and the corresponding interpretation, where all the immediate negative consequences (w.r.t. the information we have at this point and the clauses in $P$ and $P'$) are added to $L$ and all the immediate positive consequences are added to its interpretation. On the other hand, additional worlds can be added including the negative information that is supported by the existing worlds in $\mathcal{A}$. The interpretation of these additional worlds just includes the positive information that we have at this point.

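Definition 10. For all $\mathcal{A} \in Struct(\Sigma)$, $T_P(\mathcal{A}) = (W_{T_P(\mathcal{A})}, \preceq_{T_P(\mathcal{A})}, I_{T_P(\mathcal{A})})$ is the structure in $Struct(\Sigma)$ defined as follows for every $\Sigma$-program $P'$:

1. $W_{T_P(\mathcal{A})}(P') = \{\, t_{P'}(v) \mid v \in W_{\mathcal{A}}(P') \,\} \cup \{\, Succ_{P'}(v) \mid v \text{ is maximal in } W_{\mathcal{A}}(P') \,\}$, where $t_{P'}(v) = (P', L_{t_{P'}(v)})$ and $Succ_{P'}(v) = (P', L_{Succ_{P'}(v)})$ are defined as follows for every $v \in W_{\mathcal{A}}$:

$L_{t_{P'}(v)} = \{\, p \,\Box\, c \mid$ for all $p(x) \mathrel{:-} G_1, \ldots, G_m \,\Box\, d \in P \cup P'$, there exist satisfiable constraints $\{d_j\}_{j \in J}$, and for every $j \in J \subseteq \{1, \ldots, m\}$, $\exists v' \in W_{\mathcal{A}}(P')$ with $v' \prec v$ such that $v', \mathcal{A} \Vdash_l \neg G_j \,\Box\, d_j$, and $FET_\Sigma \models (c \rightarrow \neg d \vee \bigvee_{j \in J} d_j)^{\forall} \,\}$

$L_{Succ_{P'}(v)} = \{\, p \,\Box\, c \mid$ for all $p(x) \mathrel{:-} G_1, \ldots, G_m \,\Box\, d \in P \cup P'$, there exist satisfiable constraints $\{d_j\}_{j \in J}$, and for every $j \in J \subseteq \{1, \ldots, m\}$, $v, \mathcal{A} \Vdash_l \neg G_j \,\Box\, d_j$, and $FET_\Sigma \models (c \rightarrow \neg d \vee \bigvee_{j \in J} d_j)^{\forall} \,\}$

2. $I_{T_P(\mathcal{A})}$ is defined as follows. If $w \in t_{P'}(W_{\mathcal{A}}(P'))$:

$I_{T_P(\mathcal{A})}(w) = \{\, p \,\Box\, c \mid$ there is $\{\, p(x) \mathrel{:-} G^k \,\Box\, d^k \mid 1 \leq k \leq n \,\} \subseteq P \cup P'$, and $\forall k$, $1 \leq k \leq n$, $\exists v \in W_{\mathcal{A}}(P')$, $w = t_{P'}(v)$, such that $v, \mathcal{A} \Vdash_l G^k \,\Box\, d^k$ and $FET_\Sigma \models (c \rightarrow \bigvee_{k=1}^{n} d^k)^{\forall} \,\}$

If $w \in Succ_{P'}(W_{\mathcal{A}}(P'))$, $I_{T_P(\mathcal{A})}(w) = \{\, p \,\Box\, c \in I_{T_P(\mathcal{A})}(w') \mid \ldots \,\}$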



Example 11. Let us see the construction of the least fixpoint of the program $P$ given in Example 9. The bottom $\Sigma$-structure is just a structure where, for every program $P'$, we just have a world $(P', \emptyset)$ whose interpretation is the empty set of atoms. Then, the least fixpoint is obtained by iterating $T_P$ on $\bot$:

[Figure: the structures obtained by successive applications of $T_P$ to $\bot$, each drawn as the chains of worlds for the programs $\emptyset$ and $\{p \mathrel{:-} \neg q\}$, with the interpretation written to the right of each world.]


$\Sigma$-structures can be ordered according to the amount of information they contain. In particular, given two structures $\mathcal{A}$ and $\mathcal{B}$, we consider that $\mathcal{A} \preceq_F \mathcal{B}$ if there is some mapping from the worlds in $\mathcal{A}$ to the worlds in $\mathcal{B}$ such that the positive and negative information in a world $w$ in $\mathcal{A}$ is included in the positive and negative information of the corresponding worlds in $\mathcal{B}$. In general, this mapping may associate to each $w$ in $\mathcal{A}$ not just a world in $\mathcal{B}$, but a set of worlds. In addition, we ask this mapping to be downward surjective, which means that the worlds in $\mathcal{A}$ are surjectively mapped onto a prefix of the set of worlds in $\mathcal{B}$.

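Definition 12. For all $\mathcal{A}$ and $\mathcal{B} \in Struct(\Sigma)$, $\mathcal{A} \preceq_F \mathcal{B}$ if, and only if, for every $\Sigma$-program $P$, there exists a map $f_P : W_{\mathcal{A}}(P) \rightarrow 2^{W_{\mathcal{B}}(P)}$ which is:

i) Monotonic: $\forall w \in W_{\mathcal{A}}(P)$ $\forall v \in f_P(w)$, $w \preceq v$ and $I_{\mathcal{A}}(w) \subseteq I_{\mathcal{B}}(v)$, and
ii) Downward surjective: $\forall w \in W_{\mathcal{A}}(P)$ $\forall w' \in W_{\mathcal{B}}(P)$ such that $\forall v \in f_P(w)$, $w' \preceq v$, then $\exists w'' \in W_{\mathcal{A}}(P)$ with $w'' \preceq w$ and $w' \in f_P(w'')$.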

Remark 13. For every $\Sigma$-program $P$, if $T_P$ is applied to a $\Sigma$-structure $\mathcal{A}$ such that for every $P'$, $W_{\mathcal{A}}(P')$ is a totally ordered structure, then $t_{P'} : W_{\mathcal{A}}(P') \rightarrow W_{T_P(\mathcal{A})}(P')$ in Definition 10 is a downward surjective embedding. In addition, all $t_{P'}$ defined by the powers of $T_P$ on $\bot$, $\{\, T_P^k(\bot) \mid 0 \leq k \,\}$, are monotonic.

The previous relation is not an ordering when defined over the class of all $\Sigma$-structures. The problem is that antisymmetry may fail when considering non-constructive structures where there are infinite descending sequences of worlds. However, the following theorem shows that this relation is indeed an ordering when restricted to the subclass of Noetherian $\Sigma$-structures, i.e. structures that do not include infinite descending sequences of worlds.

Theorem 14. $\preceq_F$ is a complete partial order on Noetherian $\Sigma$-structures.

Proof. $\preceq_F$ is trivially reflexive. It is transitive because i) and ii) are preserved under composition. To show that it is antisymmetric we use parallel induction over the Noetherian order of worlds of $\Sigma$-structures to prove that, if there exist monotonic and downward surjective maps $f_P : W_{\mathcal{A}}(P) \rightarrow 2^{W_{\mathcal{B}}(P)}$ and $g_P : W_{\mathcal{B}}(P) \rightarrow 2^{W_{\mathcal{A}}(P)}$, then their compositions are the corresponding identities.

The bottom is $\bot = (W_\bot, \preceq_\bot, I_\bot)$, where $W_\bot = \{\, (P, \emptyset) \mid P \text{ is a } \Sigma\text{-program} \,\}$, $\preceq_\bot = \{\, (w, w) \mid w \in W_\bot \,\}$ and $\forall w \in W_\bot$, $I_\bot(w) = \{\, t \,\Box\, c \mid FET_\Sigma \models c^{\forall} \,\}$. Note that this bottom is trivially Noetherian.

Finally, the least upper bound for every increasing chain of structures $\mathcal{A}_1 \preceq_F \mathcal{A}_2 \preceq_F \cdots$ can be described as a "world-by-world union" where the correspondence between them is established by the maps $W_{\mathcal{A}_i}(P) \rightarrow 2^{W_{\mathcal{A}_{i+1}}(P)}$. ■

Theorem 15. $T_P$, when restricted to the class of Noetherian $\Sigma$-structures, is monotonic and continuous with respect to $\preceq_F$, so it has a least fixpoint $T_P \uparrow \omega$.

Proof. First of all, monotonicity is proved by showing that $\forall \mathcal{A}, \mathcal{B} \in Struct(\Sigma)$ such that $\mathcal{A} \preceq_F \mathcal{B}$ and $\forall f_{P'} : W_{\mathcal{A}}(P') \rightarrow 2^{W_{\mathcal{B}}(P')}$ satisfying Definition 12, the map $g_{P'} : W_{T_P(\mathcal{A})}(P') \rightarrow 2^{W_{T_P(\mathcal{B})}(P')}$ defined as follows also satisfies Definition 12:

- $\forall w \in W_{T_P(\mathcal{A})}(P')$, if $w = t_{P'}(v)$ then $g_{P'}(t_{P'}(v)) = t_{P'}(f_{P'}(v))$

- $\forall w \in W_{T_P(\mathcal{A})}(P')$, if $w = Succ_{P'}(v)$ then $g_{P'}(Succ_{P'}(v)) = Succ_{P'}(f_{P'}(v))$

Then, since $T_P$ is monotonic, to prove continuity it is enough to prove that $T_P$ is finitary, that is, for any infinite chain of $\Sigma$-structures $\mathcal{A}_1 \preceq_F \mathcal{A}_2 \preceq_F \cdots$ we have that $T_P(\bigsqcup_i \mathcal{A}_i) \preceq_F \bigsqcup_i T_P(\mathcal{A}_i)$. The proof proceeds in the standard way. First, one can show that any immediate consequence in $T_P(\bigsqcup_i \mathcal{A}_i)$ is obtained using a finite set of literals $\ell_j \,\Box\, d_j$ from a finite set of worlds in $\bigsqcup_i \mathcal{A}_i$. Then, there will be a least upper bound $\mathcal{A}_n$, in the chain $\{\mathcal{A}_i\}$, including all these literals. ■

6 Least Model Semantics 

In this section we will prove that the least fixpoint of the immediate consequence operator, $T_P \uparrow \omega$, is the least model in $Mod(P)$ with respect to a proper notion of ordering. The key issue here is to define an ordering relation in $Mod(P)$, which we will denote by $\sqsubseteq$, such that it adequately captures the intuition that the "best model" is the least one. The definition of this ordering is based, first, on the definition of an ordering between ordered structures associated to a given program $P$. Then this ordering is extended to compare $\Sigma$-structures by comparing the ordered structures included.

One may notice that, in an ordered structure associated to $P$ (if this structure is part of a model of a program $P'$), the negative information associated to a given world will contain, at most, the negative information supported by the worlds below. Similarly, the positive information associated to a given world will contain, at least, all the consequences that can be computed from the clauses in $P$ and in $P'$ and the negative information in the world. In this sense, one may consider that the "best ordered structure" is one in which the negative and positive information associated to each world is, respectively, the maximum and the minimum amount of possible information. This means that the ordering between ordered structures should be based on an extension of the so-called standard ordering of 3-valued structures. However, given two ordered structures $\mathcal{B}_1(P)$ and $\mathcal{B}_2(P)$, we should not try to compare pairwise all the worlds in one structure with the associated worlds in the other. For instance, let us suppose that $P$ consists of the clause $r \mathrel{:-} \neg q$ and $P'$ is empty. If $q$ is not included in the interpretation of the world $(P, \emptyset)$ in $\mathcal{B}_1(P)$ then the world above may be $(P, \{q\})$ and its interpretation would include $r$. However, if $q$ is included in the interpretation of $(P, \emptyset)$ in $\mathcal{B}_2(P)$ then the world above can be $(P, \{r\})$. $\mathcal{B}_1(P)$ should be considered better than $\mathcal{B}_2(P)$. This can be done by defining this ordering among ordered structures as some kind of lexicographic extension of the standard ordering.

However, the fact that ordered structures may not be linear poses some small additional difficulty: two worlds may be incomparable but, at the same time, be defined over the same set of worlds. Nevertheless, with the intuition discussed above, to compare $\mathcal{B}_1(P)$ and $\mathcal{B}_2(P)$ we proceed as follows. First, we look for two bars $B_1$ and $B_2$ in $\mathcal{B}_1(P)$ and $\mathcal{B}_2(P)$, respectively, such that all the worlds and interpretations below coincide and such that all the worlds and/or interpretations in both bars are different. Then, if all the worlds and interpretations in $B_1$ are smaller (w.r.t. the standard ordering) than all the worlds and interpretations in $B_2$, then we consider $\mathcal{B}_1(P)$ smaller than $\mathcal{B}_2(P)$. The following definitions capture these intuitions:

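Definition 16. Given a $\Sigma$-structure $\mathcal{B} = (W, \preceq, I)$, a $\Sigma$-program $P$ and a strict bar $B \subseteq W$ w.r.t. $(P, \emptyset)$, we define $B{\downarrow} = (W_{B\downarrow}, I_{B\downarrow})$ such that $W_{B\downarrow} = \{\, w \in W(P) \mid \exists v \in B,\ w \prec v \,\}$ and $I_{B\downarrow} = \{\, I(w) \mid w \in W_{B\downarrow} \,\}$.

Let $\mathcal{B}_1$ and $\mathcal{B}_2$ be $\Sigma$-structures, $P$ a $\Sigma$-program, $\mathcal{B}_1(P)$ and $\mathcal{B}_2(P)$ the ordered structures associated to $P$ in $\mathcal{B}_1$ and $\mathcal{B}_2$, respectively. Let $B_i \subseteq W_i(P)$, $i \in \{1, 2\}$, be strict bars w.r.t. $(P, \emptyset)$. Then, $B_1$ and $B_2$ are separator bars w.r.t. $\mathcal{B}_1(P)$ and $\mathcal{B}_2(P)$ if, and only if, $B_1{\downarrow} = B_2{\downarrow}$ and for any other strict bars $B'_i \subseteq W_i(P)$ w.r.t. $(P, \emptyset)$ such that $\forall v_i \in B_i$ $\exists v'_i \in B'_i$: $v_i \prec v'_i$, $i \in \{1, 2\}$, $B'_1{\downarrow} \neq B'_2{\downarrow}$.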

Separator bars are the bars, discussed above, that define where the differences in two ordered structures start. Separator bars are uniquely determined:

Lemma 17. Let $\mathcal{B}_1 = (W_1, \preceq_1, I_1)$ and $\mathcal{B}_2 = (W_2, \preceq_2, I_2)$ be $\Sigma$-structures, $P$ a $\Sigma$-program and $\mathcal{B}_1(P)$, $\mathcal{B}_2(P)$ the ordered structures associated to $P$ in $\mathcal{B}_1$ and $\mathcal{B}_2$, and $B_1$, $B_2$ separator bars w.r.t. $\mathcal{B}_1(P)$ and $\mathcal{B}_2(P)$. Then $B_1$ and $B_2$ are unique.

Notice that given a $\Sigma$-structure $\mathcal{B}$, each bar $B$ w.r.t. $(P, \emptyset)$ in $\mathcal{B}(P)$ induces a substructure $(W_{B\downarrow} \cup B,\ I_{B\downarrow} \cup \{\, I(w) \mid w \in B \,\})$ which corresponds to an initial segment of the ordered structure $\mathcal{B}(P)$. For simplicity, in what follows, we will refer to this substructure as $B{\downarrow} \cup B$.

Definition 18. Let $\mathcal{B}_1$ and $\mathcal{B}_2$ be $\Sigma$-structures, $P$ a $\Sigma$-program and $\mathcal{B}_1(P)$ and $\mathcal{B}_2(P)$ the ordered structures associated to $P$ in $\mathcal{B}_1$ and $\mathcal{B}_2$, respectively.

1. Let $B_i \subseteq W_i(P)$ be strict bars w.r.t. $(P, \emptyset)$, $i \in \{1, 2\}$. $B_1 = B_2$ (as bars) if, and only if, $B_1$ and $B_2$ contain the same worlds and $\forall w \in B_1 : I_1(w) = I_2(w)$.

2. Given two strict bars $B_i \subseteq W_i(P)$ w.r.t. $(P, \emptyset)$, $i \in \{1, 2\}$, $B_1 \sqsubseteq_b B_2$ if, and only if, $B_1 = B_2$ or one of the following conditions holds:

(a) $\forall w_1 \in B_1$ $\forall w_2 \in B_2 : L_{w_2} \subset L_{w_1}$

(b) $B_1$ and $B_2$ contain the same worlds and $\forall w \in B_1 : I_1(w) \subseteq I_2(w)$.

The strict order associated to this definition is denoted $\sqsubset_b$.

3. $\mathcal{B}_1(P) \sqsubseteq_s \mathcal{B}_2(P)$ if, and only if, there exist separator bars $B_i \subseteq W_i(P)$, $i \in \{1, 2\}$, w.r.t. $\mathcal{B}_1(P)$ and $\mathcal{B}_2(P)$ such that (a) or (b) holds:

(a) $B_1 = B_2$ and $\mathcal{B}_i(P) = B_i{\downarrow} \cup B_i$

(b) $B_1 \sqsubset_b B_2$

The strict order associated to this definition is denoted $\sqsubset_s$.

Theorem 19. The relation $\sqsubseteq_b$ over strict bars in ordered structures associated to a $\Sigma$-program in a $\Sigma$-structure is a partial order.

Proof. Reflexivity is straightforward and transitivity is a quite direct consequence of transitivity of $\subset$. To prove antisymmetry assume that $B_i \subseteq W_i(P)$, $i \in \{1, 2\}$, are strict bars w.r.t. $(P, \emptyset)$, such that $B_1 \sqsubseteq_b B_2$ and $B_2 \sqsubseteq_b B_1$ but they do not satisfy $B_2 = B_1$. Then, $B_1$ and $B_2$ can only be comparable by 18.2a. Then, $\forall w_1 \in B_1$ $\forall w_2 \in B_2 : L_{w_2} \subset L_{w_1}$ and $L_{w_1} \subset L_{w_2}$. This is a contradiction. ■



Theorem 20. The relation $\sqsubseteq_s$ over ordered structures associated to a $\Sigma$-program $P$ in a $\Sigma$-structure is a partial order.

Proof. Reflexivity holds trivially. To prove antisymmetry suppose $\mathcal{B}_1(P) \sqsubseteq_s \mathcal{B}_2(P)$ and $\mathcal{B}_2(P) \sqsubseteq_s \mathcal{B}_1(P)$. By Lemma 17, both relationships are established by the same separator bars $B_1$ and $B_2$, and $B_1 \sqsubseteq_b B_2$ and $B_2 \sqsubseteq_b B_1$. So, $B_1 = B_2$ and $\mathcal{B}_1(P) = \mathcal{B}_2(P)$ by 18.3a ($\mathcal{B}_1(P) = B_1{\downarrow} \cup B_1$ and $\mathcal{B}_2(P) = B_2{\downarrow} \cup B_2$). Transitivity is a consequence of Lemma 17 and of transitivity of $\sqsubseteq_b$. ■

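Example 21. In Example 9, we can see that $\mathcal{B}_1(\emptyset) \sqsubseteq_s \mathcal{B}_2(\emptyset)$. In this case, the separator bars are $B_1 = \{(\emptyset, \{p, q\})\}$ and $B_2 = \{(\emptyset, \{q\})\}$, and $B_1 \sqsubseteq_b B_2$ because $\{q\} \subset \{p, q\}$. Also, $\mathcal{B}_1(\{p \mathrel{:-} \neg q\}) \sqsubseteq_s \mathcal{B}_2(\{p \mathrel{:-} \neg q\})$. Here, the separator bars are $B'_1 = B'_2 = \{(\{p \mathrel{:-} \neg q\}, \emptyset)\}$ and $B'_1 \sqsubseteq_b B'_2$ because $I_1((\{p \mathrel{:-} \neg q\}, \emptyset)) = \emptyset \subset I_2((\{p \mathrel{:-} \neg q\}, \emptyset)) = \{p, q, r, s\}$.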

Now, we may define the order relation between $\Sigma$-structures. One obvious possible definition for such an ordering would consist in saying that $\mathcal{B}_1$ is smaller than $\mathcal{B}_2$ if for every program $P$, $\mathcal{B}_1(P)$ is smaller than $\mathcal{B}_2(P)$. However, this definition would not be adequate. The problem is that, to decide what there should be in a given world for a program $P$, we may need to look at what information is included in the worlds associated to a different program $P'$. The reason is that a certain clause in $P$ may include an intuitionistic implication. This means that before comparing the ordered structures associated to $P$ one should compare the ordered structures associated to $P'$. Again, this means that the ordering over $\Sigma$-structures should be a kind of lexicographic extension of the ordering on the structures associated to programs.

Definition 22. Given $\mathcal{B}_i \in Struct(\Sigma)$, $i \in \{1, 2\}$, $\mathcal{B}_1 \sqsubseteq \mathcal{B}_2$ if, and only if, for every $\Sigma$-program $P$ one of the following conditions holds:

1. $\mathcal{B}_1(P) \sqsubseteq_s \mathcal{B}_2(P)$

2. There exists a program $P'$, $P \subset P' \subseteq flat(P)$, such that $\mathcal{B}_1(P') \sqsubseteq_s \mathcal{B}_2(P')$

where $flat(P)$ denotes the program consisting of all the clauses in $P$ and in all the programs $flat(Q)$, where $Q$ occurs in the left-hand side of an implication in a clause in $P$.
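The recursive description of $flat(P)$ can be sketched in Python as follows. The clause encoding is hypothetical and propositional: a goal is an atom (a string), a negation `("not", G)`, or an embedded implication `("imp", Q, G)` with `Q` a frozenset of clauses; constraints are omitted.

```python
def embedded_programs(goal):
    """Yield every program Q that occurs as the left-hand side of an implication in `goal`."""
    if isinstance(goal, tuple):
        if goal[0] == "imp":
            yield goal[1]
            yield from embedded_programs(goal[2])
        elif goal[0] == "not":
            yield from embedded_programs(goal[1])


def flat(program):
    """Collect the clauses of `program` and, recursively, of every embedded program."""
    result = set(program)
    for _head, body in program:
        for goal in body:
            for embedded in embedded_programs(goal):
                result |= flat(embedded)
    return frozenset(result)


# The program of Example 9: r :- ({p :- not q} implies s), s :- p
inner = frozenset({("p", (("not", "q"),))})
P = frozenset({("r", (("imp", inner, "s"),)), ("s", ("p",))})

print(len(flat(P)))   # 3
```

For this program `flat(P)` contains the two clauses of `P` together with the embedded clause for `p`.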



Theorem 23. The relation $\sqsubseteq$ is a partial order in $Struct(\Sigma)$.

Proof. Reflexivity is a consequence of reflexivity of $\sqsubseteq_s$. To prove antisymmetry we can consider the four cases resulting from combining 22.1 and 22.2. Combining just 22.1 directly leads to the equality of structures, and it is not difficult to see that any other case leads to a contradiction. Again we can consider the four cases to prove transitivity. Now, combining just 22.2 leads to a contradiction, and the other cases hold by transitivity of $\sqsubseteq_s$. ■



Theorem 24. For any $\Sigma$-program $P$, $T_P \uparrow \omega$ is the $\sqsubseteq$-least model in $Mod(P)$.

Proof. The proof uses double induction: first on the iterations of $T_P$, to prove that for every $\mathcal{B} \in Mod(P)$ and for every $k \in \mathbb{N}$, $T_P^k(\bot) \sqsubseteq \mathcal{B}$; and then on the $\subset$-chain of programs (ordered structures) in the $\Sigma$-structures $T_P^k(\bot)$ and $\mathcal{B}$, to prove that conditions 22.1 and 22.2 hold. The base case is quite simple. The proof of $T_P^{k+1}(\bot) \sqsubseteq \mathcal{B}$ from $T_P^k(\bot) \sqsubseteq \mathcal{B}$ is a little long due to technicalities, but the essential idea is quite simple and direct: it is enough to consider that the definitions of $T_P$ for worlds and interpretations are, respectively, if-and-only-if versions of the notions of supported models and local forcing of clauses in Definition 7. ■



7 Soundness and Completeness 

In this section we will prove the equivalence between the operational and the 
model-theoretic semantics defined above. In particular this means showing the 
soundness and completeness of the operational semantics of a program P with 
respect to the least fixpoint of the immediate consequence operator Tp. 

Theorem 25 (Soundness). If $P \vdash G \,\Box\, c$ can be proved with computed answer $c'$, then $(\emptyset, \emptyset), T_P \uparrow \omega \Vdash G \,\Box\, c'$.

Proof. We prove that for every $\Sigma$-program $Q$, if $P \cup Q \vdash G \,\Box\, c$ can be proved with computed answer $c'$, then $(Q, \emptyset), T_P \uparrow \omega \Vdash G \,\Box\, c'$. The proof proceeds by induction on the number of derivation (plus subderivation) steps. The base step, when the $\Sigma$-expression in the goal is of the form $\Box\, c$, is trivial. To prove the inductive step there are four cases to consider, depending on whether the goal is negative and whether it contains an embedded implication. The proofs for each of these cases are very similar and standard. Using the corresponding operational rule, the inductive hypothesis and the definition of $T_P$ we conclude that $(Q, \emptyset), T_P \uparrow \omega \Vdash G \,\Box\, c'$. ■



Theorem 26 (Completeness). If $(\emptyset, \emptyset), T_P \uparrow \omega \Vdash G \,\Box\, c$, then $P \vdash G \,\Box\, c$ can be proved with computed answers $c_1, \ldots, c_n$ such that $FET_\Sigma \models (c \rightarrow \bigvee_{i=1}^{n} c_i)^{\forall}$.

Proof. As in the previous theorem, we prove that for every $\Sigma$-program $Q$, if $(Q, \emptyset), T_P \uparrow \omega \Vdash G \,\Box\, c$, then $P \cup Q \vdash G \,\Box\, c$ can be proved with computed answers $c_1, \ldots, c_n$ such that $FET_\Sigma \models (c \rightarrow \bigvee_{i=1}^{n} c_i)^{\forall}$. Again the proof is very standard and uses induction on the iterations of $T_P$. The base step, $k = 0$, is trivial. For proving the inductive step (in the same four cases) it is enough to use the corresponding operational rule, the inductive hypothesis and the definition of $T_P$. ■

Abstract. Multi-adjoint logic programs have recently been introduced [9, 10] as a generalization of monotonic logic programs [2, 3], in that the simultaneous use of several implications in the rules and rather general connectives in the bodies are allowed.

This paper discusses abductive reasoning, that is, reasoning in which explanatory hypotheses are formed and evaluated. To model uncertainty in human cognition and real-world applications, we use multi-adjoint logic programming to introduce and study a model of the abduction problem.



1 Introduction 

Broadly speaking, abduction aims at finding explanations for, or causes of, ob- 
served phenomena or facts; it is inference to the best explanation, a pattern 
of reasoning that occurs in such diverse places as medical diagnosis, scientific 
theory formation, accident investigation, language understanding, and jury de- 
liberation. More formally, abduction is an inference mechanism where given a 
knowledge base and some observations, the reasoner tries to find hypotheses 
which together with the knowledge base explain the observations. Reasoning 
based on such an inference mechanism is referred to as abductive reasoning. 
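The informal description of abduction above can be made concrete with a small Python sketch. It covers only the classical two-valued propositional case with definite rules, not the multi-adjoint setting developed in this paper, and all names (`consequences`, `explanations`, the example atoms) are hypothetical. It enumerates sets of abducible atoms and keeps the subset-minimal ones which, together with the knowledge base, entail the observation.

```python
from itertools import combinations


def consequences(rules, facts):
    """Forward chaining: all atoms derivable from `facts` with definite `rules`."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            if head not in known and all(b in known for b in body):
                known.add(head)
                changed = True
    return known


def explanations(rules, abducibles, observation):
    """Return the subset-minimal sets of abducibles that entail `observation`."""
    found = []
    for size in range(len(abducibles) + 1):
        for hyp in combinations(sorted(abducibles), size):
            if any(set(e) <= set(hyp) for e in found):
                continue                      # a smaller explanation already covers it
            if observation in consequences(rules, hyp):
                found.append(hyp)
    return found


rules = [
    ("wet_grass", ("rain",)),
    ("wet_grass", ("sprinkler",)),
    ("wet_shoes", ("wet_grass",)),
]
print(explanations(rules, {"rain", "sprinkler"}, "wet_shoes"))
# [('rain',), ('sprinkler',)]
```

Enumerating by increasing size is what makes the minimality check sound: every proper subset of a candidate has already been examined when the candidate is reached.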

Abductive reasoning has been recognized as an important form of reasoning 
with incomplete information that is appropriate for many problems in Artificial 
Intelligence. These problems include updates in databases, belief revision, plan- 
ning, diagnosis, natural language understanding, default reasoning, user mod- 
elling and, in general, problems requiring reasoning with incomplete information. 

The purpose of this work is to provide a theoretical framework for abduction in multi-adjoint logic programming [9]. The special feature of multi-adjoint logic programs is that a number of different implications can be used in the rules of a program. Specifically, the language and semantics of monotonic logic programs are generalized in order to encompass more complex rules. For simplicity of presentation, only the propositional (ground) case will be considered. A whole class of abduction problems with uncertainty expressed within the language of multi-adjoint programs can be solved by our method.

A general theory of logic programming which allows the simultaneous use of different implications in the rules and rather general connectives in the bodies is presented in [9], where models of these programs are proved to be post-fixpoints of the immediate consequences operator, which turns out to be monotonic under very general hypotheses. In addition, the continuity of the immediate consequences operator is studied, and some sufficient conditions for its continuity are obtained. A procedural semantics for multi-adjoint logic programs under these conditions, together with its completeness result, was given in [10].

The structure of the paper is as follows: In Section 2, the preliminary defini- 
tions are introduced; later, in Section 3, the syntax and semantics of multi-adjoint 
logic programs are given, and the results about the continuity of the immediate 
consequences operator are presented. In Section 4, the procedural semantics of 
multi-adjoint logic programs is defined and the completeness results are stated. 
Section 5 is the main part of this paper. We give definitions of the abduction problem and of correct and computed explanations. We prove soundness and completeness of our abduction semantics. The computation of the cheapest explanation w.r.t. a price function can be implemented, in certain lattices, by a logic programming computation followed by a linear programming optimization.
The paper finishes with some conclusions and pointers to future work. 

2 Preliminary Definitions 

In order to make this paper as self-contained as possible, the preliminary defini- 
tions required to formally define multi-adjoint logic programs are given in this 
section, which contains the approach given in [2,3,9]. 

We assume the reader is familiar with the constructions and terminology of universal
algebra, such as graded set, $\Omega$-algebra and subalgebra of an $\Omega$-algebra, which
are used to define formally the syntax and the semantics of the languages we
will deal with.

The main concept we use in this section is that of adjoint pair, first introduced
in a logical context by Pavelka [12], who interpreted the poset structure
of the set of truth-values as a category, and the relation between the connectives
of implication and conjunction as functors in this category. The result turned
out to be another example of the well-known concept of adjunction, introduced
by Kan in the general setting of category theory in 1950 (see also the notion
of a relative pseudo-complement in lattice theory, e.g. in Rasiowa and Sikorski's
'Mathematics of Metamathematics' (1968)).

Definition 1 (Adjoint pair). Let $\langle P, \preceq\rangle$ be a partially ordered set and $(\leftarrow, \&)$
a pair of binary operations in $P$ such that:

(a1) Operation $\&$ is increasing in both arguments, i.e. if $x_1, x_2, y \in P$ are such that
$x_1 \preceq x_2$ then $(x_1 \mathbin{\&} y) \preceq (x_2 \mathbin{\&} y)$ and $(y \mathbin{\&} x_1) \preceq (y \mathbin{\&} x_2)$;

(a2) Operation $\leftarrow$ is increasing in the first argument (the consequent) and
decreasing in the second argument (the antecedent), i.e. if $x_1, x_2, y \in P$ are such
that $x_1 \preceq x_2$ then $(x_1 \leftarrow y) \preceq (x_2 \leftarrow y)$ and $(y \leftarrow x_2) \preceq (y \leftarrow x_1)$;

(a3) For any $x, y, z \in P$, we have that $x \preceq (y \leftarrow z)$ holds if and only if
$(x \mathbin{\&} z) \preceq y$ holds.

Then we say that $(\leftarrow, \&)$ forms an adjoint pair in $\langle P, \preceq\rangle$.

Property (a3) corresponds to categorical adjointness, and can be
adequately interpreted in terms of multiple-valued inference as both the assertion
that the truth-value of $y \leftarrow z$ is the maximal $x$ satisfying $x \mathbin{\&} z \preceq y$, and the
validity of a generalized modus ponens rule [5].

Extending the results in [2,3,13,14] to a more general setting, in which different
implications (Łukasiewicz, Gödel, product) and thus several modus ponens-like
inference rules are used, naturally leads to considering several adjoint pairs
in the lattice. More formally:

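Definition 2 (Multi-Adjoint Semilattice). Let $\langle L, \preceq\rangle$ be a cus-lattice. A
multi-adjoint semilattice $\mathcal{L}$ is a tuple $(L, \preceq, \leftarrow_1, \&_1, \dots, \leftarrow_n, \&_n)$ satisfying the
following items:

(l1) $\langle L, \preceq\rangle$ is bounded, i.e. it has bottom ($\bot$) and top ($\top$) elements;

(l2) $(\leftarrow_i, \&_i)$ is an adjoint pair in $\langle L, \preceq\rangle$ for $i = 1, \dots, n$;

(l3) $\top \mathbin{\&_i} \vartheta = \vartheta \mathbin{\&_i} \top = \vartheta$ for all $\vartheta \in L$ and for $i = 1, \dots, n$.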

Remark 1. Note that residuated lattices are a special case of multi-adjoint
semilattice, in which the underlying poset has a cus-lattice structure, has monoidal
structure wrt $\otimes$ and $\top$, and only one adjoint pair is present.

From the point of view of expressiveness, it is interesting to allow extra 
operators to be involved with the operators in the multi-adjoint semilattice. The 
structure which captures this possibility is that of a multi-adjoint algebra. 

Definition 3 (Multi-Adjoint $\Omega$-Algebra). Let $\Omega$ be a graded set containing
operators $\leftarrow_i$ and $\&_i$ for $i = 1, \dots, n$ and possibly some extra operators, and let
$\mathfrak{L} = \langle L, I\rangle$ be an $\Omega$-algebra whose carrier set $L$ is a cus-lattice under $\preceq$.
We say that $\mathfrak{L}$ is a multi-adjoint $\Omega$-algebra with respect to the pairs $(\leftarrow_i, \&_i)$
for $i = 1, \dots, n$ if $\mathcal{L} = (L, \preceq, I(\leftarrow_1), I(\&_1), \dots, I(\leftarrow_n), I(\&_n))$ is a multi-adjoint
semilattice.

In practice, the extra operators will be assumed to be either conjunctors or
disjunctors or aggregators.

Example 1. Consider $\Omega = \{\leftarrow_P, \&_P, \leftarrow_G, \&_G, \wedge_L, @\}$, the real unit interval $U =
[0, 1]$ with its lattice structure, and the interpretation function $I$ defined as:

$$
\begin{aligned}
I(\leftarrow_P)(x,y) &= \min(1, x/y) & I(\&_P)(x,y) &= x \cdot y \\
I(\leftarrow_G)(x,y) &= \begin{cases} 1 & \text{if } y \le x \\ x & \text{otherwise} \end{cases} & I(\&_G)(x,y) &= \min(x,y) \\
I(@)(x,y,z) &= \tfrac{1}{6}(x + 2y + 3z) & I(\wedge_L)(x,y) &= \max(0, x + y - 1)
\end{aligned}
$$

that is, connectives are interpreted as product and Gödel connectives, a weighted
sum and Łukasiewicz conjunction; then $\langle U, I\rangle$ is a multi-adjoint $\Omega$-algebra with
one aggregator and one additional conjunctor (denoted $\wedge_L$ to make explicit that
its adjoint implicator is not in the language). □
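The connectives of Example 1 are easy to experiment with. The following Python sketch encodes them with exact rationals and checks property (a3) on a finite grid of truth-values. The function names, the grid, and the convention that the product implication takes the value 1 when the antecedent is 0 are assumptions made for this illustration only; they do not come from the original text.

```python
from fractions import Fraction
from itertools import product


def imp_product(x, y):
    """Product implication x <- y (x is the consequent): min(1, x / y).
    Assumption: the value is 1 when the antecedent y is 0."""
    return Fraction(1) if y == 0 else min(Fraction(1), x / y)


def and_product(x, y):
    return x * y


def imp_godel(x, y):
    """Goedel implication x <- y: 1 if y <= x, otherwise x."""
    return Fraction(1) if y <= x else x


def and_godel(x, y):
    return min(x, y)


def and_lukasiewicz(x, y):
    return max(Fraction(0), x + y - 1)


def aggregate(x, y, z):
    """Weighted sum (x + 2y + 3z) / 6."""
    return (x + 2 * y + 3 * z) / 6


def is_adjoint_pair(imp, conj, grid):
    """Check (a3) on the grid: x <= (y <- z) iff (x & z) <= y."""
    return all((x <= imp(y, z)) == (conj(x, z) <= y)
               for x, y, z in product(grid, repeat=3))


GRID = [Fraction(k, 10) for k in range(11)]
print(is_adjoint_pair(imp_product, and_product, GRID))
print(is_adjoint_pair(imp_godel, and_godel, GRID))
print(is_adjoint_pair(imp_godel, and_lukasiewicz, GRID))
```

On this grid the product and Gödel pairs satisfy (a3), whereas pairing the Gödel implication with the Łukasiewicz conjunction does not. A grid check is of course only a sanity test, not a proof.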

The syntax of the propositional languages we will work with will be defined
by using the concept of $\Omega$-algebra. To begin with, the concept of alphabet of the
language is introduced below.

Definition 4 (Alphabet). Let $\Omega$ be a graded set, and $\Pi$ a countably infinite
set. The alphabet $A_{\Omega,\Pi}$ associated to $\Omega$ and $\Pi$ is defined to be the disjoint union
$\Omega \cup \Pi \cup S$, where $S$ is the set of auxiliary symbols “(”, “)” and “,”.

In the following, we will use only $A_\Omega$ to designate an alphabet, for deleting the
reference to $\Pi$ cannot lead to confusion.

Definition 5 (Expressions). Given a graded set $\Omega$ and alphabet $A_\Omega$, the
$\Omega$-algebra $\mathfrak{E} = \langle A_\Omega^*, I\rangle$ of expressions is defined as follows:

1. The carrier $A_\Omega^*$ is the set of strings over $A_\Omega$.

2. The interpretation function $I$ satisfies the following conditions for strings
$a_1, \dots, a_n$ in $A_\Omega^*$:

- $c_{\mathfrak{E}} = c$, where $c$ is a constant operation ($c \in \Omega_0$).

- $\omega_{\mathfrak{E}}(a_1) = \omega\, a_1$, where $\omega$ is a unary operation ($\omega \in \Omega_1$).

- $\omega_{\mathfrak{E}}(a_1, a_2) = (a_1\, \omega\, a_2)$, where $\omega$ is a binary operation ($\omega \in \Omega_2$).

- $\omega_{\mathfrak{E}}(a_1, \dots, a_n) = \omega(a_1, \dots, a_n)$, where $\omega$ is an $n$-ary operation ($\omega \in \Omega_n$)
and $n > 2$.
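The four cases of the interpretation function can be mirrored directly on strings. The following Python sketch is only an illustration; the function name `express` and the ASCII operator spellings such as `&G` are hypothetical choices, not notation from the paper.

```python
def express(op, *args):
    """Build the expression string for operator symbol `op` applied to the strings `args`."""
    if len(args) == 0:      # constant operation: c
        return op
    if len(args) == 1:      # unary operation: prefix notation
        return op + args[0]
    if len(args) == 2:      # binary operation: parenthesised infix notation
        return "(" + args[0] + op + args[1] + ")"
    # n-ary operation with n > 2: functional notation
    return op + "(" + ",".join(args) + ")"


body = express("&G", "p", express("@", "q", "r", "s"))
print(body)  # (p&G@(q,r,s))
```

Because the arguments are arbitrary strings, `express` happily builds ill-formed strings as well, which is why well-formed formulas are singled out separately as a subalgebra.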

Note that an expression is only a string of letters of the alphabet, that is,
it need not be a well-formed formula. Actually, the set of well-formed formulas is the
subset of the set of expressions defined as follows:

Definition 6 (Well-formed formulas). Let $\Omega$ be a graded set, $\Pi$ a countable
set of propositional symbols and $\mathfrak{E}$ the algebra of expressions corresponding to
the alphabet $A_{\Omega,\Pi}$. The well-formed formulas (in short, formulas) generated by
$\Omega$ over $\Pi$ is the least subalgebra $\mathfrak{F}$ of the algebra of expressions $\mathfrak{E}$ containing $\Pi$.

The set of formulas, that is the carrier of $\mathfrak{F}$, will be denoted $F_\Omega$. It is well
known that least subalgebras can be defined as an inductive closure, and it is
not difficult to check that it is freely generated; therefore it satisfies the unique
homomorphic extension theorem.

3 Syntax and Semantics of Multi-adjoint Logic Programs

Multi-adjoint logic programs are constructed from the abstract syntax induced
by a multi-adjoint algebra on a set of propositional symbols. Specifically, we
will consider a multi-adjoint $\Omega$-algebra $\mathfrak{L}$ whose extra operators are either
conjunctors, denoted $\wedge_1, \dots, \wedge_k$, or disjunctors, denoted $\vee_1, \dots, \vee_l$, or aggregators,
denoted $@_1, \dots, @_m$. (This algebra will host the manipulation of the truth-values
of the formulas in our programs.)

In addition, let $\Pi$ be a set of propositional symbols and $\mathfrak{F}$ the corresponding
algebra of formulas freely generated from $\Pi$ by the operators in $\Omega$. (This
algebra will be used to define the syntax of a propositional language.)

Remark 2. As we are working with two $\Omega$-algebras, and to lighten the notation,
we introduce a special notation to clarify which algebra an operator belongs
to, instead of continuously using either $\omega_{\mathfrak{L}}$ or $\omega_{\mathfrak{F}}$. Let $\omega$ be an operator symbol
in $\Omega$; its interpretation under $\mathfrak{L}$ is denoted $\dot{\omega}$ (a dot on the operator), whereas
$\omega$ itself will denote $\omega_{\mathfrak{F}}$ when there is no risk of confusion.

The definition of multi-adjoint logic program is given, as usual, as a set of 
rules and facts. The particular syntax of these rules and facts is given below: 

Definition 7 (Multi-Adjoint Logic Programs). A multi-adjoint logic program
is a set P of rules of the form $\langle (A \leftarrow_i \mathcal{B}), \vartheta\rangle$ such that:

1. The rule $(A \leftarrow_i \mathcal{B})$ is a formula of $\mathfrak{F}$;

2. The confidence factor $\vartheta$ is an element (a truth-value) of $L$;

3. The head of the rule $A$ is a propositional symbol of $\Pi$.

4. The body formula $\mathcal{B}$ is a formula of $\mathfrak{F}$ built from propositional symbols
$B_1, \dots, B_n$ ($n \ge 0$) by the use of conjunctors $\&_1, \dots, \&_n$ and $\wedge_1, \dots, \wedge_k$,
disjunctors $\vee_1, \dots, \vee_l$ and aggregators $@_1, \dots, @_m$.

5. Facts are rules with body $\top$.

6. A query (or goal) is a propositional symbol intended as a question $?A$
prompting the system.

Note that an arbitrary composition of conjunctors, disjunctors and aggregators
is also an aggregator.

Sometimes, we will represent the above pair as $A \xleftarrow{\vartheta}_i @[B_1, \dots, B_n]$, where$^1$
$B_1, \dots, B_n$ are the propositional variables occurring in the body and $@$ is the
aggregator obtained as a composition.

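Example 2. The following program P, where the subscripts G, L, P on the
connectives mean Gödel, Łukasiewicz and product connectives, is an example of a
[0,1]-valued multi-adjoint logic program consisting of five rules and three facts.

$$
\begin{array}{rcll}
\mathtt{high\_fuel\_consumption} & \xleftarrow{0.8}_G & \mathtt{rich\_mixture} \wedge_L \mathtt{low\_oil} & (1) \\
\mathtt{overheating} & \xleftarrow{0.5}_P & \mathtt{low\_oil} & (2) \\
\mathtt{noisy\_behaviour} & \xleftarrow{0.8} & \mathtt{rich\_mixture} & (3) \\
\mathtt{overheating} & \xleftarrow{0.9} & \mathtt{low\_water} & (4) \\
\mathtt{noisy\_behaviour} & \xleftarrow{1} & \mathtt{low\_oil} & (5) \\
\mathtt{low\_oil} & \xleftarrow{0.2} & & (6) \\
\mathtt{low\_water} & \xleftarrow{0.2}_P & & (7) \\
\mathtt{rich\_mixture} & \xleftarrow{0.5}_P & & (8)
\end{array}
$$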

$^1$ Note the use of square brackets.
This program is intended to represent some general knowledge about the
behaviour of a car.

Definition 8 (Interpretation). An interpretation is a mapping $I\colon \Pi \to L$.
The set of all interpretations of the formulas defined by the $\Omega$-algebra $\mathfrak{F}$ in the
$\Omega$-algebra $\mathfrak{L}$ is denoted $\mathcal{I}_{\mathfrak{L}}$.

Note that by the unique homomorphic extension theorem, each of these
interpretations can be uniquely extended to the whole set of formulas $F_\Omega$.

The ordering $\preceq$ of the truth-values $L$ can be easily extended to the set of
interpretations as usual:

Definition 9 (Semilattice of interpretations). Consider two interpretations
$I_1, I_2 \in \mathcal{I}_{\mathfrak{L}}$. Then, $\langle \mathcal{I}_{\mathfrak{L}}, \sqsubseteq\rangle$ is a cus-lattice where $I_1 \sqsubseteq I_2$ iff $I_1(p) \preceq I_2(p)$ for all
$p \in \Pi$. The least interpretation $\triangle$ maps every propositional symbol to the least
element $\bot$ of $L$.

A rule of a multi-adjoint logic program is satisfied whenever the truth-value
of the rule is greater than or equal to the confidence factor associated with the rule.
Formally:

Definition 10 (Satisfaction, Model). Given an interpretation $I \in \mathcal{I}_{\mathfrak{L}}$, a
weighted rule $\langle A \leftarrow_i \mathcal{B}, \vartheta\rangle$ is satisfied by $I$ iff $\vartheta \preceq \hat{I}(A \leftarrow_i \mathcal{B})$. An
interpretation $I \in \mathcal{I}_{\mathfrak{L}}$ is a model of a multi-adjoint logic program P iff all weighted rules
in P are satisfied by $I$.

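Note the following equalities

$$\hat{I}(A \leftarrow_i \mathcal{B}) = \hat{I}(A) \mathbin{\dot{\leftarrow}_i} \hat{I}(\mathcal{B}) = I(A) \mathbin{\dot{\leftarrow}_i} \hat{I}(\mathcal{B})$$

and the evaluation of $\hat{I}(\mathcal{B})$ proceeds inductively as usual, till all propositional
symbols in $\mathcal{B}$ are reached and evaluated under $I$. For the particular case of a
fact (a rule with $\top$ in the body) satisfaction of $\langle A \leftarrow_i \top, \vartheta\rangle$ means

$$\vartheta \preceq \hat{I}(A \leftarrow_i \top) = I(A) \mathbin{\dot{\leftarrow}_i} \top;$$

by property (a3) of adjoint pairs this is equivalent to $\vartheta \mathbin{\dot{\&}_i} \top \preceq I(A)$ and this, by
assumption (l3) of multi-adjoint semilattices, gives $\vartheta \preceq I(A)$.

Definition 11. An element $\lambda \in L$ is a correct answer for a program P and a
query $?A$ if for an arbitrary interpretation $I\colon \Pi \to L$ which is a model of P we
have $\lambda \preceq I(A)$.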

Example 3. The interpretation $I$ defined by

I(low_oil) = 0.25
I(low_water) = 0.35
I(overheating) = 0.45
I(rich_mixture) = 0.90
I(noisy_behaviour) = 0.75
I(high_fuel_consumption) = 0.55

is a model of the program given in Example 2.

If we add the query ?overheating to the program, then 0.1 is a correct
answer. Actually, it is the greatest correct answer for the query.
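The model claim of Example 3 can be checked mechanically. By property (a3), a weighted rule with head A, body B, weight ϑ and implication i is satisfied by I exactly when ϑ &_i I(B) ≤ I(A), so only the adjoint conjunctors are needed. The Python sketch below rests on a loud assumption: the implication subscripts of several rules of Example 2 are not legible in this copy, so rule (1) is taken as Gödel (as suggested by the `min(0.8, ...)` computation in Example 4) and every other rule and fact as product. All identifiers are hypothetical.

```python
from fractions import Fraction as F


def and_p(x, y):  # product conjunction
    return x * y


def and_g(x, y):  # Goedel conjunction
    return min(x, y)


def and_l(x, y):  # Lukasiewicz conjunction
    return max(F(0), x + y - 1)


TOP = F(1)

# (head, adjoint conjunctor of the rule's implication, weight, body evaluator).
# ASSUMPTION: rule (1) uses the Goedel pair, all other rules the product pair.
RULES = [
    ("high_fuel_consumption", and_g, F("0.8"),
     lambda I: and_l(I["rich_mixture"], I["low_oil"])),
    ("overheating", and_p, F("0.5"), lambda I: I["low_oil"]),
    ("noisy_behaviour", and_p, F("0.8"), lambda I: I["rich_mixture"]),
    ("overheating", and_p, F("0.9"), lambda I: I["low_water"]),
    ("noisy_behaviour", and_p, F(1), lambda I: I["low_oil"]),
    ("low_oil", and_p, F("0.2"), lambda I: TOP),
    ("low_water", and_p, F("0.2"), lambda I: TOP),
    ("rich_mixture", and_p, F("0.5"), lambda I: TOP),
]

EXAMPLE_3 = {
    "low_oil": F("0.25"), "low_water": F("0.35"), "overheating": F("0.45"),
    "rich_mixture": F("0.90"), "noisy_behaviour": F("0.75"),
    "high_fuel_consumption": F("0.55"),
}


def is_model(interp, rules):
    """A rule is satisfied iff weight & I(body) <= I(head), by adjointness (a3)."""
    return all(conj(weight, body(interp)) <= interp[head]
               for head, conj, weight, body in rules)


print(is_model(EXAMPLE_3, RULES))
```

Under these assumptions the interpretation of Example 3 passes the check, while lowering `rich_mixture` below 0.5 breaks fact (8).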
The immediate consequences operator, given by van Emden and Kowalski
in [15], can be generalised to the framework of multi-adjoint logic programs as
follows:

Definition 12. Let P be a multi-adjoint logic program. The immediate consequences
operator $T_P\colon \mathcal{I}_{\mathfrak{L}} \to \mathcal{I}_{\mathfrak{L}}$, mapping interpretations to interpretations, is
defined by considering

$$T_P(I)(A) = \sup\{\vartheta \mathbin{\dot{\&}_i} \hat{I}(\mathcal{B}) \mid \langle A \leftarrow_i \mathcal{B}, \vartheta\rangle \in P\}$$

Note that all the suprema involved in the definition do exist because $L$ is assumed
to be a cus-lattice.

As is usual in the logic programming framework, the semantics of a
multi-adjoint logic program is characterized by the post-fixpoints of $T_P$.

Theorem 1 ([9]). An interpretation $I$ of $\mathcal{I}_{\mathfrak{L}}$ is a model of a multi-adjoint logic
program P iff $T_P(I) \sqsubseteq I$.

Note that the fixpoint theorem works even without any further assumptions
on conjunctors (in particular, they need not be commutative or associative).

The monotonicity of the operator $T_P$, for the case of only one adjoint pair,
has been shown in [2]. The proof for the general case is similar.

Theorem 2 ([9]). The operator $T_P$ is monotonic.

Due to the monotonicity of the immediate consequences operator, the
semantics of P is given by its least model which, as shown by Knaster-Tarski's
theorem, is exactly the least fixpoint of $T_P$, which can be obtained by
transfinitely iterating $T_P$ from the least interpretation $\triangle$.

It is worthwhile to investigate conditions under which the operator $T_P$ is
continuous. In [9] it was proved that whenever every operator in $\Omega$ turns out to
be continuous in the lattice, then $T_P$ is also continuous and, consequently, its
least fixpoint can be obtained by a countably infinite iteration from the least
interpretation. Formally,

Theorem 3 ([9]). If all the operators occurring in the bodies of the rules of a
program P are continuous, and the adjoint conjunctions are continuous in their
second argument, then $T_P$ is continuous.
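To make the iteration from the least interpretation concrete, here is a minimal Python sketch of the immediate consequences operator on [0,1], where the supremum over finitely many rules is a maximum. The three-rule program is hypothetical (it is not the program of Example 2), and the names `t_p` and `least_fixpoint` are illustrative only.

```python
from fractions import Fraction as F


def and_p(x, y):  # product conjunction
    return x * y


def and_g(x, y):  # Goedel conjunction
    return min(x, y)


ATOMS = ["p", "q", "r"]

# Hypothetical program: (head, adjoint conjunctor, weight, body evaluator).
#   p <-_P  q &_G r   with weight 0.8
#   r <-_P  q         with weight 0.9
#   q <-_P  T         with weight 0.5   (a fact)
RULES = [
    ("p", and_p, F("0.8"), lambda I: and_g(I["q"], I["r"])),
    ("r", and_p, F("0.9"), lambda I: I["q"]),
    ("q", and_p, F("0.5"), lambda I: F(1)),
]


def t_p(rules, atoms, interp):
    """T_P(I)(A) = sup of weight & I(body) over the rules with head A (bottom if none)."""
    new = {a: F(0) for a in atoms}
    for head, conj, weight, body in rules:
        new[head] = max(new[head], conj(weight, body(interp)))
    return new


def least_fixpoint(rules, atoms, max_steps=1000):
    """Iterate T_P from the least interpretation until nothing changes."""
    interp = {a: F(0) for a in atoms}
    for _ in range(max_steps):
        nxt = t_p(rules, atoms, interp)
        if nxt == interp:
            return interp
        interp = nxt
    raise RuntimeError("no fixpoint reached within max_steps")


print(least_fixpoint(RULES, ATOMS))
```

For this small acyclic program the iteration stabilises after three steps with q = 1/2, r = 9/20 and p = 9/25. In general termination in finitely many steps is not guaranteed, hence the `max_steps` guard.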

4 Procedural Semantics of Multi-adjoint Logic Programs 

Once it has been shown that the operator $T_P$ is continuous under very general
hypotheses, the least model can be reached in at most countably many iterations.
Therefore, it is worthwhile to define a procedural semantics which allows us to actually
construct the answer to a query against a given program.

In the following, we work in a hybrid $\Omega$-algebra made up from the elements
of the lattice and the same alphabet as the language, except for the adjoint implicators.

For the formal description of the computational model, we will consider an
extended language $\mathfrak{F}'$ defined on the same graded set, but whose carrier is
the disjoint union $\Pi \uplus L$; this way we can work simultaneously with propositional
symbols and with the truth-values they represent.
Definition 13. Let P be a multi-adjoint logic program on a multi-adjoint
$\Omega$-algebra $\mathfrak{L}$ with carrier $L$ and $V$ the set of truth values of the rules in P. The
extended language $\mathfrak{F}'$ is the corresponding $\Omega$-algebra of formulas freely generated
from the disjoint union of $\Pi$ and $V$.

The formulas in the language $\mathfrak{F}'$ will be referred to as extended formulas. An
operator symbol $\omega$ interpreted under $\mathfrak{F}'$ will be denoted as $\bar{\omega}$.

Our computational model will take a query (an atom), and will provide a 
lower bound of the value of A under any model of the program. Intuitively, 
the computation proceeds by, somehow, substituting propositional symbols by 
lower bounds of their truth-value until, eventually, an extended formula with no
propositional symbol is obtained, which will be interpreted in the multi-adjoint 
semilattice to get the computed answer. 

Given a program P, we define the following admissible rules for transforming 
any extended formula. 

Definition 14. Admissible rules are defined as follows:

1. Substitute an atom $A$ in an extended formula by $(\vartheta \mathbin{\bar{\&}_i} \mathcal{B})$ whenever there
exists a rule $\langle A \leftarrow_i \mathcal{B}, \vartheta\rangle$ in P.

2. Substitute an atom $A$ in an extended formula by $\bot$.

3. Substitute an atom $A$ in an extended formula by $\vartheta$ whenever there exists a
fact $\langle A \leftarrow_i \top, \vartheta\rangle$ in P.

Note that if an extended formula turns out to have no propositional symbols,
then it can be directly interpreted in the multi-adjoint $\Omega$-algebra $\mathfrak{L}$. This justifies
the following definition of computed answer.

Definition 15. Let P be a program in a multi-adjoint language interpreted on
a multi-adjoint semilattice $\mathcal{L}$ and let $?A$ be a goal. An element $\dot{@}[r_1, \dots, r_m]$,
with $r_i \in L$ for all $i \in \{1, \dots, m\}$, is said to be a computed answer if there is a
sequence $G_0, \dots, G_{n+1}$ such that

1. $G_0 = A$ and $G_{n+1} = \bar{@}[r_1, \dots, r_m]$ where $r_i \in L$ for all $i = 1, \dots, m$.

2. Every $G_i$, for $i = 1, \dots, n$, is a formula in $\mathfrak{F}'$.

3. Every $G_{i+1}$ is inferred from $G_i$ by one of the admissible rules.

The idea of the computation is to consecutively apply admissible rules until
an extended formula with no propositional symbols, $\bar{@}[r_1, \dots, r_m]$, is obtained,
which can be interpreted as the element $\dot{@}[r_1, \dots, r_m]$ in the lattice $\mathcal{L}$.

An alternative formalism of Generalized Annotated Logic Programs (GALP) 
was introduced in [7]. The procedural semantics of GALP uses a GLP-like pro- 
cedure to solve a set of lattice inequalities to find a computed answer. Our pro- 
cedural semantics replaces constraints in the form of inequalities by equalities 
building a final formula for @ which is the best answer. 

It might be the case that for some lattices it is not possible to get the correct
answer; simply consider $L$ to be the powerset of a two-element set $\{a, b\}$ ordered
by inclusion. The requirement of the reductant property stated below will allow
us to avoid these cases; see [7,11] for details.
Definition 16 (Reductant, reductant property). Let P be a program on a
multi-adjoint $\Omega$-algebra with values in a multi-adjoint semilattice $\mathcal{L}$; assume
that all the rules in P with head $A$ are $A \xleftarrow{\vartheta_i}_i \mathcal{B}_i$ for $i = 1, \dots, n$. A reductant
for $A$ is a rule $A \xleftarrow{\vartheta} @(\mathcal{B}_1, \dots, \mathcal{B}_n)$ such that for any $b_1, \dots, b_n$ we have

$$\sup\{\vartheta_i \mathbin{\dot{\&}_i} b_i \mid i = 1, \dots, n\} = \vartheta \mathbin{\dot{\&}} \dot{@}(b_1, \dots, b_n)$$

A program P is said to have the reductant property if there exist reductants for
any atom $A$ occurring in the head of some rule.

Note that $\vartheta$ and $@$ should depend only on the (multi-)set of $\&_i$.

Certainly, it will be interesting to consider only programs which contain all
their reductants, but this might be too strong a condition on our programs; the
following proposition shows that we can assume that our programs contain all
the reductants, because the set of models is preserved.

Proposition 1 ([10]). Any reductant $A \xleftarrow{\vartheta} \mathcal{B}$ of P is satisfied by any model
of P. In short, $P \models A \xleftarrow{\vartheta} \mathcal{B}$.

As a consequence of the proposition above, we can assume that a program 
contains all its reductants, since its set of models is not modified. 

Definition 17. Given a program P with the reductant property and a query
$?A$, the greatest computed answer is a computed answer in whose calculation
admissible rules 1 and 3 are applied only with rules (and facts) that are reductants in P,
and admissible rule 2 is applied if and only if no rule/fact exists for a given
atom in the extended formula.

Theorem 4. Given a program P, a query $?A$ and a computed answer $\lambda'$. If $\lambda$
is the greatest computed answer, then $\lambda' \preceq \lambda$.

The theorem above shows that computed answers as in the previous definition 
are actually the greatest. 

Theorem 5. Given a program P with the reductant property, for all atoms $A$ let
$\lambda_A$ be the greatest computed answer for P and query $?A$; then $\lambda_A \preceq T_P^{\omega}(\triangle)(A)$.

It was proved in [10] that, given a program P, $T_P^{n}(\triangle)(A)$ is a computed
answer for all $n$ and for all queries $?A$. Now, in conjunction with the result above,
we straightforwardly obtain the two following corollaries, which will be used later.

Corollary 1. $\lambda \in L$ is the greatest computed answer for program P and query
$?A$ if and only if $\lambda = T_P^{\omega}(\triangle)(A)$.

To finish the section, simply recall the following result from [10].

Corollary 2. $\lambda \in L$ is a correct answer for program P and query $?A$ if and only
if $\lambda \preceq T_P^{\omega}(\triangle)(A)$.
5 Abduction in Multi-adjoint Logic Programs

To state the intuition behind an abduction problem, we will use the program
in Example 2, but without the three facts; that is, we have no information
about the variables rich_mixture, low_oil, low_water, which will turn out to be
hypotheses to explain the behaviour of our car.

If we notice it is noisy, overheated and has a high fuel consumption, we would
like to know why it is so. Let us call these assertions observation variables and
let us denote their set by

OV = {noisy_behaviour, overheating, high_fuel_consumption}.

As noisiness can be subjective and the level of fuel consumption and the temperature
of the engine can take different (high) values, our observations are estimated
(by an expert) by confidence factors. So the second parameter of our abduction
problem is the set of observations (sometimes called manifestations, symptoms, effects)
represented by a theory, i.e. a partial mapping $OBS\colon OV \to L$ consisting of facts,
which can be thought of as observation variables equipped with confidence factors

high_fuel_consumption $\leftarrow_P$, overheating $\leftarrow_P$, noisy_behaviour $\leftarrow_P$

Note that it is not necessary to assume a specific type of implication for obser- 
vations - any will do. The full strength of multi-adjointness is needed for the 
logic programming part where specific implication describes a specific action of 
the truth value of the rule. 

We would like to find explanations (causes) for given observations (symptoms)
by means of semantical consequence. Namely, explanations are assertions
which, together with domain knowledge, force every model of them to be also a
model of the observations. That is, whenever in a real world situation represented
by an interpretation $I$ both domain knowledge and explanations are true, then
all observations are true in $I$.

Possible explanations will be $L$-fuzzy subsets of the set of hypotheses

H = {rich_mixture, low_oil, low_water}

This makes sense, because in the realm of a theory and of observations suffering 
from uncertainty, we can expect that certain level of confidence of hypotheses 
can be (under the presence of the theory) an explanation of these (uncertain) 
observations. The formal definitions of abduction problem, explanation, etc. are 
given below. 

Definition 18. An abduction problem consists of a tuple $\mathcal{A} = \langle P, OBS, H\rangle$,
where

1. P is a multi-adjoint logic program.

2. $H \subseteq \Pi$ is the set of hypotheses.

3. $OBS\colon OV \to L$ is the $L$-fuzzy theory of observations (where $OV$ is a set of
observation variables such that $OV \cap H = \emptyset$).

The intended meaning of $OV \cap H = \emptyset$ is that observation variables should not
be explained by themselves.

A theory is a mapping assigning formulas a truth value. For two theories $T$
and $S$, let $T \cup S$ denote the union of them as a theory defined by

$$(T \cup S)(A) = \max\{T(A), S(A)\}$$



Definition 19. An $L$-fuzzy theory $E\colon H \to L$ is a correct explanation to an
abduction problem $\mathcal{A} = \langle P, OBS, H\rangle$ if

1. $P \cup E$ is satisfiable.

2. $P \cup E$ semantically implies $OBS$, that is, every model of $P \cup E$ is also a model
of $OBS$.

It will be useful to represent explanations as a subset of $L^n$, especially when
$L = [0,1]$. If $H = \{h_1, \dots, h_n\}$ is the set of hypotheses and $E\colon H \to L$ is an
explanation, it is uniquely determined by its values

$$(E(h_1), \dots, E(h_n)) \in L^n$$

so an element $\varepsilon = (\varepsilon_1, \dots, \varepsilon_n) \in L^n$ represents the mapping $E_\varepsilon(h_i) = \varepsilon_i$.

The set of correct explanations for $\mathcal{A}$ will be denoted by $SOL_d(\mathcal{A})$, where
the subscript $d$ recalls the declarative character of the explanations.

Using our motivating example, we illustrate a possible solution of a
whole class of problems which are formulated within our formalism.

Example 4. Consider our motor vehicle example, our multi-adjoint program P
and the query ?high_fuel_consumption. By a multi-adjoint logic program
computation, using the first program rule, we get

$$\min(0.8, \max(0, \mathtt{rich\_mixture} + \mathtt{low\_oil} - 1))$$

Now, similarly to abductive logic programming in two-valued logic [6],
instead of failing in a proof when a selected subgoal fails to unify with
the head of any rule, our procedure views the subgoal as a hypothesis; that is, if we know
the confidence factors for rich mixture and low oil (from an explanation), we have the
computed answer for high fuel consumption. To fulfil

$$OBS(\mathtt{high\_fuel\_consumption}) \ge 0.25$$

it should be

$$\min(0.8, \max(0, \mathtt{rich\_mixture} + \mathtt{low\_oil} - 1)) \ge 0.25$$

hence

$$\mathtt{rich\_mixture} + \mathtt{low\_oil} \ge 1.25$$
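The step from the inequality on the computed formula to the linear constraint can be sanity-checked numerically. The Python sketch below evaluates the formula obtained from the first rule on a grid of exact rationals; the function names and the grid resolution are assumptions made for this illustration.

```python
from fractions import Fraction as F


def g_high_fuel_consumption(rich_mixture, low_oil):
    """Formula obtained from rule (1): min(0.8, max(0, rich_mixture + low_oil - 1))."""
    return min(F("0.8"), max(F(0), rich_mixture + low_oil - 1))


def explains(rich_mixture, low_oil, threshold):
    """True when the hypothesis values reach the observed confidence threshold."""
    return g_high_fuel_consumption(rich_mixture, low_oil) >= threshold


GRID = [F(k, 40) for k in range(41)]  # 0, 0.025, ..., 1

# For threshold 0.25 the condition is equivalent to rich_mixture + low_oil >= 1.25.
equivalent = all(
    explains(r, o, F("0.25")) == (r + o >= F("1.25"))
    for r in GRID for o in GRID
)
print(equivalent)

# A threshold above the rule weight 0.8 cannot be reached by any hypothesis values.
print(any(explains(r, o, F("0.9")) for r in GRID for o in GRID))
```

The first check succeeds on the whole grid because the weight 0.8 exceeds the threshold, so only the Łukasiewicz part of the formula matters. The second check finds no explanation, since the formula is bounded above by 0.8.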

Note that if OBS(high_fuel_consumption) were greater than 0.8, there would be
no explanation for this. To overcome this we could think of the possibility of
calculating the truth value of the metamathematical assertion “$E$ is an
explanation for $\mathcal{A}$”. Here we consider the case when this truth value is 1, that is, $E$
is an explanation with full confidence. This is why we are calculating the truth
value of the following implication: whenever $I$ is a model of $P \cup E$ then $I$ is a
model of $OBS$. In particular, the truth value of the implication

“If I(rich_mixture) + I(low_oil) ≥ 1.25
then I(high_fuel_consumption) ≥ 0.25”

And again, the truth value of $x \le y$ can be calculated as $\dot{\leftarrow}(y, x)$.

So our multi-adjoint logic programming abduction should run as a usual logic 
program with two exceptions 

— it successfully ends without resolving variables which are in the set of hy- 
potheses 

— it is prompted by a query with threshold, which can serve as a cut (as we 
will see later). 

Definition 20 (Procedural semantics for abduction). Let us have an
abduction problem $\mathcal{A} = \langle P, OBS, H\rangle$ and consider $m \in OV$. A successful
abduction for $\mathcal{A}$ and $m$ is a sequence $\mathcal{G} = (G_0, G_1, \dots, G_l)$ of extended formulas in the
language of multi-adjoint logic program computations such that:

1. $G_0 = m$,

2. $G_l$ contains only variables from $H$, and

3. (a) For all $i < l$, $G_{i+1}$ is inferred from $G_i$ by one of the admissible rules, and

(b) For the constant interpretation $I_\top\colon \Pi \to \{\top\}$ the inequality
$I_\top(G_{i+1}) \ge OBS(m)$ holds.

The last condition is to be understood as a cut, because it allows one to estimate
the best possible computation of the remaining propositional variables.

This definition allows the explanation of a single observation variable. The
merging of several observation variables is given in Definition 21 below. Solutions are
obtained as a combination (intersection) of the above sets over all $m$ in the
observation variables. Moreover, the set of all solutions is the union over all
possible combinations of all possible computational branches for all observation
variables.

For $H = \{h_1, \dots, h_n\}$, the expression $G_l$ can be understood as a function of
$n$ variables from $L^n$ to $L$, and that is why we can denote it by $G_l = G_m(h_1, \dots, h_n)$.

Theorem 6 (Existence of solutions). Let $\mathcal{A} = \langle P, OBS, H\rangle$ be an abduction
problem and assume that for each $m \in OV$ there is a successful abduction for $\mathcal{A}$
and $m$. Then $SOL_d(\mathcal{A}) \neq \emptyset$.

Our definition of abduction gives us the possibility to define computed
explanations for abduction problems $\mathcal{A}$.

Definition 21. A tuple $\varepsilon = (\varepsilon_1, \dots, \varepsilon_n)$ is a computed explanation for an
abduction problem $\mathcal{A} = \langle P, OBS, H\rangle$ if for every $m \in OV$ there is an abduction
$G_m$ for $\mathcal{A}$ and $m$ such that

$$G_m(\varepsilon_1, \dots, \varepsilon_n) \ge OBS(m)$$

The set of all computed explanations will be denoted by $SOL_p(\mathcal{A})$, where the
subscript $p$ recalls the procedural character of this definition.
Example 5. In our motor vehicle example we calculated that the first rule of P gives
rich_mixture + low_oil ≥ 1.25; similarly, the second rule gives low_oil ≥ 0.5 and
the third gives rich_mixture ≥ 0.625, hence the set (coordinates are ordered as
(rich_mixture, low_oil, low_water))

$$\{(\varepsilon_1, \varepsilon_2, \varepsilon_3) \in [0,1]^3 : \varepsilon_1 + \varepsilon_2 \ge 1.25 \text{ and } \varepsilon_1 \ge 0.625 \text{ and } \varepsilon_2 \ge 0.5\}$$

is a subset of $SOL_p(\mathcal{A})$.

This example also shows that the area obtained as an intersection of some
computational branches for the $m$'s has the shape of a convex body. Moreover, the set of all
solutions is the union of such areas. It seems reasonable that, to get the cheapest
answer, we have just to run a linear programming optimization separately on
each of these areas.

Theorem 7 (Soundness). Assume $\mathcal{A}$ is a definite abduction problem, then
$SOL_p(\mathcal{A}) \subseteq SOL_d(\mathcal{A})$ (that is, every computed explanation for $\mathcal{A}$ is also a
correct explanation).

In the completeness theorem below we need the assumption that our logical 
program has a finite computational tree, according to abductions. This is very 
often the case in practical applications of abduction, because e.g. observations 
should not be explained by themselves, and most of logic programs for abduction 
are layered. Moreover if the conjunctors are archimedean (also very often) then 
the abduction ends below the observation value threshold, and hence is cut. 

Theorem 8 (Completeness). Assume $\mathcal{A}$ is an abduction problem and the
logical program has a finite computational tree according to abductions, then
$SOL_d(\mathcal{A}) \subseteq SOL_p(\mathcal{A})$ (that is, every correct explanation for $\mathcal{A}$ is also a
computed explanation).

From now on we do not have to distinguish between the two sets of solutions and
we simply denote them by $SOL(\mathcal{A})$.

We comment here very briefly on the possibility of using linear programming in
some cases to obtain the cheapest explanation to an abduction problem wrt a
given cost function.

Example 6. Continuing the example of motor vehicles, assume that checking
$(\varepsilon_1, \varepsilon_2, \varepsilon_3) \in SOL(\mathcal{A})$ costs $2\varepsilon_1 + \varepsilon_2 + 0.1\varepsilon_3$. The space of solutions $SOL(\mathcal{A})$
is bounded by linear surfaces in $[0,1]^3$ and is the union of four convex bodies,
obtained from the combinations of rules in the program for each observation
variable.

Using the first three rules of the program P we get one of these four convex
bodies, namely the set

$$S_1 = \{(\varepsilon_1, \varepsilon_2, \varepsilon_3) \in [0,1]^3 \mid \varepsilon_1 + \varepsilon_2 \ge 1.25 \text{ and } \varepsilon_1 \ge 0.625 \text{ and } \varepsilon_2 \ge 0.5\}$$

Applying a linear programming method for $S_1$ wrt our cost function we get in
this convex body a minimal solution (0.625, 0.625, 0) at a cost of 1.875.

Actually, the cheapest solution of our running abduction problem $\mathcal{A}$ is $\varepsilon_{min} =
(0.25, 1, 0.35)$ at a cost of 1.535, obtained from the convex body of all solutions to
the first, fourth and fifth program rule.
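The optimisation over $S_1$ is small enough to reproduce without a solver. The Python sketch below does not use a linear programming library; it swaps in brute-force vertex enumeration, which is valid here because a linear cost on a bounded non-empty polytope attains its minimum at a vertex. All function names are hypothetical, and only the constraints of $S_1$ stated in the example are encoded.

```python
from fractions import Fraction as F
from itertools import combinations


def det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve3(rows, rhs):
    """Solve a 3x3 linear system by Cramer's rule; return None if it is singular."""
    d = det3(rows)
    if d == 0:
        return None
    solution = []
    for col in range(3):
        replaced = [list(row) for row in rows]
        for r in range(3):
            replaced[r][col] = rhs[r]
        solution.append(F(det3(replaced)) / d)
    return tuple(solution)


def dot(a, x):
    return sum(ai * xi for ai, xi in zip(a, x))


def cheapest_vertex(constraints, cost):
    """Minimise cost.x over {x : a.x >= b for every (a, b)} by enumerating vertices."""
    best = None
    for triple in combinations(constraints, 3):
        point = solve3([a for a, _ in triple], [b for _, b in triple])
        if point is None:
            continue
        if all(dot(a, point) >= b for a, b in constraints):
            value = dot(cost, point)
            if best is None or value < best[0]:
                best = (value, point)
    return best


# Unit cube 0 <= e_i <= 1, written as a.x >= b.
BOX = [((1, 0, 0), 0), ((0, 1, 0), 0), ((0, 0, 1), 0),
       ((-1, 0, 0), -1), ((0, -1, 0), -1), ((0, 0, -1), -1)]
# Constraints of S_1: e1 + e2 >= 1.25, e1 >= 0.625, e2 >= 0.5.
S1 = BOX + [((1, 1, 0), F(5, 4)), ((1, 0, 0), F(5, 8)), ((0, 1, 0), F(1, 2))]
COST = (F(2), F(1), F(1, 10))

value, point = cheapest_vertex(S1, COST)
print(float(value), [float(x) for x in point])
```

On $S_1$ the enumeration returns the point (0.625, 0.625, 0) with cost 1.875, in agreement with the example. The cost of the reported overall optimum can be checked with `dot(COST, (F("0.25"), F(1), F("0.35")))`, which evaluates to 1.535.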
Eiter and Gottlob showed in [4] that deciding $SOL_d(\mathcal{A}) \neq \emptyset$ for two-valued
logic is complete for complexity classes at the second level of the polynomial
hierarchy, while the use of prioritization raises the complexity to the third level in
certain cases (for arbitrary propositional theories).

One can ask how difficult it is to decide, in our framework, whether $\varepsilon \in SOL(\mathcal{A})$
or not. Now we see that it substantially depends on the complexity of the logic
programming computation and the complexity of the functions evaluating the truth
values for connectives; thus, from a computational point of view, it makes sense
to use simple connectives (e.g. linear in each coordinate, as product is, or even
partly constant). So, assuming connectives are easy, this problem is in NP.

Regarding the linear programming approach to the cheapest explanations, as
linear programming is polynomial and Prolog is in NP, finding minimal solutions
for an abduction problem (assuming connectives are coordinatewise linear) does
not increase the complexity and remains in NP.

6 Conclusions and Future Work 

In this work a theoretical framework for abduction in multi-adjoint logic pro- 
gramming is introduced; a sound and complete procedural semantics has been 
defined, and the possibility of obtaining the cheapest possible explanation to 
an abduction problem wrt a cost function by means of a logic programming 
computation followed by a linear programming optimization has been shown. 

Future work on this area will be concerned with showing the embedding 
of different approaches to abduction in our framework, as well as the study of 
complexity issues in lattices more general than the unit interval. 
Abstract. We advocate a declarative approach to proving properties
of logic programs. Total correctness can be separated into correctness, 
completeness and clean termination; the latter includes non-floundering. 
Only clean termination depends on the operational semantics, in par- 
ticular on the selection rule. We show how to deal with correctness and 
completeness in a declarative way, treating programs only from the logi- 
cal point of view. Specifications used in this approach are interpretations 
(or theories). We point out that specifications for correctness may dif- 
fer from those for completeness, as usually there are answers which are 
neither considered erroneous nor required to be computed. 

We present proof methods for correctness and completeness for definite 
programs and generalize them to normal programs. The considered se- 
mantics of normal programs is the standard one, given by the program 
completion in 3-valued logic.

The method of proving correctness of definite programs is not new and
can be traced back to the work of Clark in 1979. However, a more complicated
approach using operational semantics was proposed by some
authors. We show that it is not stronger than the declarative one, as far 
as properties of program answers are concerned. 



1 Introduction 

This paper discusses reasoning about logic programs in terms of their declarative 
semantics. We view total correctness of programs as consisting of correctness, 
completeness and clean termination. Correctness means that any answer ob- 
tained from the program satisfies its specification. As logic programming is non- 
deterministic, one is interested in completeness, i.e. that all the results required 
by the specification are computed. Programs should also (cleanly) terminate:
computations should be finite and free of run-time errors such as floundering
and arithmetical exceptions. Obviously, clean termination depends on the
operational semantics, in particular on the selection rule. However, correctness
and completeness do not. It is desirable to deal with them in a declarative
way, abstracting from any operational semantics and treating programs
and their answers only from the logical point of view. This makes it possible to
separate reasoning about “logic” and “control”.

In this paper we show how to prove correctness and completeness declara- 
tively. We discuss a known method of proving correctness of definite programs 
and introduce a method for proving completeness. Then we generalize both meth- 
ods to programs with negation. 

The proof method for definite program correctness [Cla79,Hog81,Der93] is 
simple and straightforward. It is declarative: it abstracts from any operational 
semantics. It should be well known; however, its usefulness is often not
appreciated. Instead, a more complicated approach using operational semantics was
proposed by some authors [BC89,Apt97,PR99]. That approach takes into account
the form of atoms selected under LD-resolution. We show that the operational 
approach is not stronger than the declarative one, as far as properties of program 
answers are concerned. 

The following observation is important for our approach: one should not re- 
quire using the same specification for both correctness and completeness. This is 
natural, as there usually are answers which are neither considered erroneous nor 
required to be computed. Using the same specification requires making decisions
like “is $append([\,], 1, 1)$ correct?”; this brings substantial and unnecessary
complications. So there is some 3-valued flavour even in logic programming without
negation [Nai00].

The paper consists of two main chapters. The first is devoted to definite 
programs, the second to normal programs. In each case we first discuss proving 
correctness, then completeness. A comparison with the operational approach 
to proving correctness follows the section on definite program correctness. An
additional section contains example proofs for normal programs. The paper is
concluded by a section on related work. An unabridged, preliminary version of this
paper can be found in [DM01].

2 Preliminaries 

For basic definitions we refer the reader to [Llo87] and to [Apt97,Doe94]. We
consider the declarative semantics given by 3-valued logical consequence of program
completion [Kun87] (however, our proof methods use only 2-valued logic). This
is a standard semantics for normal programs with finite failure [Doe94]. It is a
generalization of the classical semantics for definite programs. SLDNF-resolution
is sound for it and important completeness results exist.

By a computed (resp. correct) answer we mean an instance $Q\theta$ of a query $Q$,
where $\theta$ is a computed (correct) answer substitution for $Q$ and the given program
(and Q is a sequence of literals). Notice that, by soundness and completeness of 
SLD-resolution, the sets of computed and of correct answers for a given program 
(and arbitrary queries) are equal. So in the case of definite programs we often 
do not distinguish between these two kinds of answers. 
We represent interpretations as sets: Herbrand interpretations as sets of
ground atoms, non-Herbrand ones as sets of constructs of the form $p(\ldots)$, where
p is a predicate symbol (cf. [Llo87, p. 12], [Doe94, p. 124]). 

3 Reasoning about Definite Programs 

First we discuss correctness. We show a way of proving program correctness 
w.r.t. specifications. In the next section we compare it with an approach related 
to operational semantics and show that it is not weaker. Then we show a way of 
proving completeness. 

3.1 Correctness of Definite Programs 

We begin with a brief discussion on specifications. As a standard example let us 
take the program APPEND: 

    $app([\,], L, L) \leftarrow$
    $app([H|K], L, [H|M]) \leftarrow app(K, L, M)$

We want to prove that it indeed appends lists. We need a precise statement (a 
specification) of this property. A slight complication is that the program does 
not actually define the relation of list concatenation, but its superset; the least
Herbrand model contains atoms like $app([\,], 1, 1)$. This is a common phenomenon
in logic programming: the least model contains “ill-typed” atoms which are
irrelevant for the correctness of the program.

So we want to prove that: 

For any answer $app(k, l, m)$, if $k$ and $l$ are lists¹ then $m$ is a list and $k * l = m$.

(By a list we mean a term $[t_1, \ldots, t_n]$ (in Prolog notation), where $n \geq 0$ and
$t_1, \ldots, t_n$ are possibly non-ground terms. The symbol $*$ denotes list
concatenation.) This property could be equivalently expressed as

    $spec \models app(k, l, m)$    (1)

for any answer $app(k, l, m)$, where $spec$ is the Herbrand interpretation:

    $spec = \{\, app(k, l, m) \in \mathcal{H} \mid \text{if } k \text{ and } l \text{ are lists then } m \text{ is a list and } k * l = m \,\}$

($\mathcal{H}$ is the Herbrand base; we assume a fixed infinite set of function symbols).
Obviously, (1) holds iff all the ground instances of $app(k, l, m)$ are in $spec$.

Notice that we do not need to refer to the notion of a query in the specification.
Assume that $app(k, l, m) = app(k', l', m')\theta$ is a computed answer for a
query $app(k', l', m')$. If $k', l'$ are lists then ($k, l$ are lists and) the specification
implies that $m$ is a list and $k * l = m$.

¹ Actually, the requirement on $k$ is unnecessary. Our intention however is to follow
the corresponding example from [Apt97]. A full specification of APPEND may be:
if $l$ is a list or $m$ is a list then $k, l, m$ are lists and $k * l = m$.

Such specifications, referring to program answers, will be called declarative. 
A declarative specification can be an interpretation (possibly a non-Herbrand
one) or a theory.² In this paper we will use specifications of the first kind, but
most of our results also apply to specifications of the second kind. 

Definition 3.1. A definite program is correct w.r.t. a declarative specification
$spec$ iff $spec \models Q$ for any answer $Q$ of the program.

Notice that if a program is correct w.r.t. a Herbrand interpretation spec then its 
least Herbrand model is a subset of spec. 

To prove correctness (of a logic program w.r.t. a declarative specification)
we use an obvious approach, discussed among others by Clark [Cla79], Hogger
[Hog81, p. 378-9] and Deransart [Der93, Section 3].³ We will call it the natural
proof method. It consists of showing that $spec \models C$ for each clause $C$ of
the considered program. The soundness of the natural method follows from the
following simple property:

Proposition 3.2 (Correctness, definite programs). Let $P$ be a program
and $spec$ be an interpretation. If

    $spec \models P$

then $P$ is correct w.r.t. specification $spec$.

Proof. By soundness of SLD-resolution, $P \models Q$ for any answer $Q$. Now $spec \models P$
and $P \models Q$ imply $spec \models Q$. (This also holds for $spec$ being a theory.) □

The method is also complete [Der93] in the following sense. If a program
$P$ is correct w.r.t. a declarative specification $spec$ then there exists a stronger
specification $spec' \subseteq spec$ such that $spec' \models P$, and thus the method is applicable
to $spec'$. (The least model of $P$ over the given domain can be taken as $spec'$.)

In our example the correctness proof of APPEND is simple. We present here
its less trivial part in detail. Consider the second clause. To show that

    $spec \models app([H|K], L, [H|M]) \leftarrow app(K, L, M)$

take ground terms $h, k, l, m$⁴ such that $spec \models app(k, l, m)$ (in other words
$app(k, l, m) \in spec$). We have to show that $spec \models app([h|k], l, [h|m])$. Assume
that $[h|k]$ and $l$ are lists (hence $k$ is a list). Then $m$ is a list and $k * l = m$,
as $spec \models app(k, l, m)$. Thus $[h|m]$ is a list and $[h|k] * l = [h|m]$, hence
$app([h|k], l, [h|m]) \in spec$. This concludes the proof.
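The natural method only asks whether each clause is true in the interpretation $spec$, so on a finite fragment of the Herbrand universe it can be checked mechanically. The following Python sketch does this for APPEND; it is an illustration, not part of the proof. The term encoding (`NIL`, `cons`), the helper names and the depth bound are assumptions made for this example, and a bounded check can only refute, never establish, $spec \models$ APPEND.

```python
from itertools import product

NIL = "nil"  # stands for the empty list []


def cons(head, tail):
    """Build the term [head|tail]; the tail need not be a list."""
    return ("cons", head, tail)


def is_list(term):
    while isinstance(term, tuple):
        term = term[2]
    return term == NIL


def concat(k, l):
    """k * l for a list k (l may be any term)."""
    return l if k == NIL else cons(k[1], concat(k[2], l))


def in_spec(k, l, m):
    """app(k,l,m) is in spec: if k and l are lists then m is a list and k * l = m."""
    if is_list(k) and is_list(l):
        return is_list(m) and concat(k, l) == m
    return True


def universe(depth):
    """Ground terms of nesting depth <= depth built from 'a', [] and [_|_]."""
    terms = ["a", NIL]
    for _ in range(depth):
        terms = ["a", NIL] + [cons(h, t) for h, t in product(terms, repeat=2)]
    return terms


def first_clause_true(terms):
    # app([], L, L) <-
    return all(in_spec(NIL, l, l) for l in terms)


def second_clause_true(heads, terms):
    # app([H|K], L, [H|M]) <- app(K, L, M)
    return all(in_spec(cons(h, k), l, cons(h, m))
               for h in heads
               for k, l, m in product(terms, repeat=3)
               if in_spec(k, l, m))


U = universe(2)
print(first_clause_true(U), second_clause_true(universe(0), U))
```

Note that `in_spec(NIL, "a", "a")` holds: the "ill-typed" atom is inside $spec$, exactly as the discussion of the least Herbrand model requires.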

² A specification equivalent to our example specification $spec$ may consist of an axiom
$app(k, l, m) \rightarrow (list(k), list(l) \rightarrow list(m), k * l = m)$ together with axioms describing
predicates $=$, $list$ and function $*$, and an induction schema for lists.

³ where it is called “inductive proof method”.

⁴ and valuation $\{ H/h, K/k, L/l, M/m \}$
Programs dealing with accumulators or difference lists are sometimes considered
difficult to reason about. The following example shows that this is not
the case when the natural method is used. Consider the standard REVERSE
program:

    $reverse(X, Y) \leftarrow rev(X, Y, [\,])$
    $rev([\,], X, X) \leftarrow$
    $rev([H|L], X, Y) \leftarrow rev(L, X, [H|Y])$

Formally, a difference list representing a list $[t_1, \ldots, t_n]$ is any pair $([t_1, \ldots, t_n|t], t)$
of terms. The declarative reading of the program is simple: the first argument of
$rev$ is a list, and its reverse is represented as a difference list of the second and the
third argument. This can be expressed by a formal specification

    $spec = \{\, reverse([t_1, \ldots, t_n], [t_n, \ldots, t_1]) \mid n \geq 0,\ t_1, \ldots, t_n \in T \,\}$
    $\quad \cup\ \{\, rev([t_1, \ldots, t_n], [t_n, \ldots, t_1|t], t) \mid n \geq 0,\ t_1, \ldots, t_n, t \in T \,\}$

where $T$ is the set of ground terms. The reader can check that $spec \models$ REVERSE,
thus the program is correct w.r.t. $spec$.

Notice that the natural method refers only to the declarative semantics of 
programs. A specification is an interpretation (alternatively a theory). Correct- 
ness is expressed as truth (of the program’s answers) in the interpretation. Pro- 
gram clauses are treated as logic formulae, their truth in the interpretation is 
to be shown. We abstract from any operational semantics, in particular from 
the form of queries appearing during computation. The reasoning is obviously 
independent from the selection rule. Still we can use declarative specifications to 
reason about queries and corresponding answers, using the fact that an answer 
is an instance of the query. 

3.2 Call-Success Specifications and the Operational Approach 

Some authors propose another approach to proving correctness of definite programs
[BC89], [Apt97, Chapter 8], [PR99].⁵ We will call it an operational proof
method. In this section we show that it is not stronger than the natural method
from the previous section (as far as properties of program answers are concerned).

A specification in the operational approach consists of two parts. The precon- 
dition specifies the procedure calls that appear during the computation (more 
precisely, the atoms that are selected in the LD-resolution). The postcondition 
specifies the procedure successes (the computed instances of procedure calls). 
We will call such specifications call-success specifications. Formally, pre- and 
postconditions are sets of atoms, closed under substitutions. 

A program is correct if every procedure call and every success satisfy the pre- 
or postcondition respectively, provided that the initial query satisfies a certain 
condition. Notice that this is not a declarative property. It considers not only 
computed answers, but whole computations (LD-trees), and it depends on the 
selection rule used. 

⁵ Whenever these approaches differ, we follow that of [Apt97].
The operational proof method was proposed by Bossi and Cocco [BC89] and
is an instance of the method of Drabent and Małuszyński [DM88]. It is based
on the following verification condition. For each clause $C$ of the program, show
that for each (possibly non-ground) instance $H \leftarrow B_1, \ldots, B_n$ ($n \geq 0$) of $C$:

    if $H \in pre$, $B_1, \ldots, B_k \in post$ then $B_{k+1} \in pre$ (for $k = 0, \ldots, n-1$),
    if $H \in pre$, $B_1, \ldots, B_n \in post$ then $H \in post$.

The condition on the initial query is that, for any instance $B_1, \ldots, B_n$ ($n \geq 0$)
of the query, if $B_1, \ldots, B_k \in post$ then $B_{k+1} \in pre$ (for $k = 0, \ldots, n-1$).

Let us come back to our APPEND example. We refer here to its treatment 
in [Apt97, p. 214]. The precondition and postcondition are, respectively, 

    $pre = \{\, app(k, l, m) \mid k \text{ and } l \text{ are lists} \,\}$,
    $post = \{\, app(k, l, m) \mid k, l, m \text{ are lists and } k * l = m \,\}$.

(Here $k, l, m$ are terms, possibly non-ground.) The verification conditions to
be proved consist of one implication for the first clause of APPEND and two 
implications for the second one. The details of the proof can be found in [Apt97]. 

Notice that the operational method requires proving one implication per 
atom occurring in the program or in the initial query. In contrast, the natural 
method from the previous section requires proving one implication per program 
clause. The natural method is independent of the operational semantics and of 
the ordering of the body atoms of program clauses. 

The natural method is an instance of the operational one. (For a given declarative
specification $spec$ from the natural method, take the set of all atoms as the
precondition and $post = \{\, A \mid spec \models A \,\}$ as the postcondition.) We show, however,
that both methods are equivalent as far as properties of answers are of interest.
Consider a call-success specification $(pre, post)$. A corresponding declarative
specification could be seen, speaking informally, as the implication $pre \rightarrow post$.

Definition 3.3. Let $pre$ and $post$ be sets of atoms closed under substitution.
The declarative specification corresponding to the call-success specification
$(pre, post)$ is the Herbrand interpretation

    $pre \Rightarrow post := \{\, A \in \mathcal{H} \mid \text{if } A \in pre \text{ then } A \in post \,\}$.

In other words, $pre \Rightarrow post = (\mathcal{H} \setminus pre) \cup (\mathcal{H} \cap post)$. If $P$ is correct w.r.t.
$pre \Rightarrow post$ and $A\theta$ is an answer to a query $A \in pre$ then $A\theta \in post$.
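To make Definition 3.3 concrete, the sketch below builds the interpretation on a small finite fragment of the Herbrand base for APPEND and compares the two formulations given above. It is only an illustration under stated assumptions: lists are encoded as Python tuples, the atom `"a"` is the only non-list term, and the sets `pre` and `post` contain only the ground atoms of the sample (the paper's pre- and postconditions also contain non-ground atoms).

```python
from itertools import product

# Assumed sample of ground terms: one non-list atom and three lists.
TERMS = ["a", (), ("a",), ("a", "a")]


def is_list(t):
    return isinstance(t, tuple)


# A triple (k, l, m) stands for the ground atom app(k, l, m).
H = set(product(TERMS, repeat=3))

pre = {(k, l, m) for (k, l, m) in H if is_list(k) and is_list(l)}
post = {(k, l, m) for (k, l, m) in H
        if is_list(k) and is_list(l) and is_list(m) and k + l == m}

# The definition: atoms A such that A in pre implies A in post.
by_definition = {a for a in H if a not in pre or a in post}
# The equivalent set-algebra form: (H \ pre) union (H intersect post).
by_set_algebra = (H - pre) | (H & post)

# The specification spec of Section 3.1, restricted to the same sample.
spec = {(k, l, m) for (k, l, m) in H
        if not (is_list(k) and is_list(l)) or (is_list(m) and k + l == m)}

print(by_definition == by_set_algebra, by_definition == spec)
```

On this sample the declarative counterpart of the call-success specification from [Apt97] coincides with the specification $spec$ used for the natural method, and it contains the ill-typed atom `((), "a", "a")` that `post` excludes.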

The following proposition compares corresponding declarative and call-success 
specifications. Proposition 3.5 compares both proof methods, showing that the 
natural method is not weaker. 

Proposition 3.4. If a program $P$ is correct w.r.t. the call-success specification
$(pre, post)$ then $P$ is correct w.r.t. the declarative specification $pre \Rightarrow post$.

Proposition 3.5. If $P$ and $(pre, post)$ satisfy the verification condition of the
operational method then $pre \Rightarrow post \models P$.
In other words, assume that by the operational method it can be shown that
a program $P$ is correct w.r.t. $(pre, post)$. Then it can be shown that $P$ is correct
w.r.t. $pre \Rightarrow post$, using the natural method.

For proofs of both propositions see [Dra99] (and [CD88] for the second one).
The first property is also mentioned in [dB+97,PR99]. The converses of the two
propositions do not hold. For a counterexample consider reordering the body
atoms in a correct program. For further comparisons see [Dra99].

Notice the difference in the treatment of “ill-typed” atoms (like $app([\,], 1, 1)$
for APPEND) by the specifications in both methods. Declarative specifications
include all such atoms; call-success specifications exclude them.



3.3 Completeness of Definite Programs 

Let us begin with the observation that, for a given program, a specification for
completeness is in general different from that for correctness. For the purposes
of correctness we describe a superset of the set of answers of a program. For
the purposes of completeness we describe its subset, as a program satisfying a
completeness requirement may compute something more than required. Often,
when a specification for correctness is of the form $pre \Rightarrow post$, a specification
for completeness is $post$.



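[Figure: diagram relating the two specifications. Labels: “specification for completeness” (the required answers), “specification for correctness”, and “incorrect” (the atoms outside the specification for correctness).]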



For instance, it makes no sense to require that the APPEND program be complete
w.r.t. the specification $spec$ from the beginning of Section 3.1. This would
mean computing all the “ill-typed” answers, like $app(a, b, c)$. Our specification
for completeness of APPEND is the Herbrand interpretation



    $specC = \{\, app(k, l, m) \in \mathcal{H} \mid k, l, m \text{ are lists},\ k * l = m \,\}$.

Notice that it properly expresses our intentions: APPEND should compute all
the cases of list concatenation. The difference $spec \setminus specC$ contains only
“ill-typed” atoms, with the first or second argument not being a list. We are not
interested in whether they are answers of APPEND.

As previously, we consider specifications which are (possibly non-Herbrand)
interpretations.

Definition 3.6. A program $P$ is complete for a query $Q$ w.r.t. a specification
$specC$ if $specC \models Q\theta$ implies that $Q\theta$ is an answer for the program, for any
instance $Q\theta$ of $Q$.

Notice that an answer $Q\theta$ is an instance of some computed answer for $Q$.

Below we refer to the theory ONLY-IF($P$) [Apt90] that is usually used in
defining the Clark completion $comp(P)$ of a program $P$. Informally, ONLY-IF($P$) is
$P$ with implications reversed. For each predicate symbol $p$, if the clauses of $P$
beginning with $p$ are $p(\vec{t}_1) \leftarrow \vec{B}_1, \ldots, p(\vec{t}_k) \leftarrow \vec{B}_k$ then ONLY-IF($P$) contains

    $p(\vec{x}) \rightarrow \bigvee_{i=1}^{k} \exists\ldots\ (\vec{x} = \vec{t}_i \wedge \vec{B}_i)$,

where $\vec{x}$ are distinct new variables and the quantification is over the variables
occurring in the clauses. For $k = 0$ the implication is equivalent to $\neg p(\vec{x})$. In our
example, ONLY-IF(APPEND) is (equivalent to)

    $app(x, y, z) \rightarrow (x = [\,] \wedge y = z) \vee \exists h, k, m\, (x = [h|k] \wedge z = [h|m] \wedge app(k, y, m))$.

We will also need a specification for equality: $spec_= = \{\, t = t \mid t \text{ is a ground term} \,\}$.

The following property can be used to prove completeness of a program. It 
is a special case of Theorem 4.6 for normal programs. 

Proposition 3.7 (Completeness, definite programs). Let $P$ be a program
and $Q$ a query. Assume that

(i) $specC \cup spec_= \models$ ONLY-IF($P$) and

(ii) $P$ terminates for $Q$, i.e. there exists a finite SLD-tree for $Q$ and $P$.

Then $P$ is complete for $Q$.

For instance, consider program APPEND and the specification $specC$ given
above. It is easy to show that $specC \cup spec_= \models$ ONLY-IF(APPEND). Consider
$Q = app(k, l, m)$, where $m$ is a list. One can show, using any standard method,
that $Q$ terminates under the Prolog selection rule. Thus APPEND is complete for $Q$.

Now assume that $k, l$ are distinct variables. Taking lists $k', l'$ such that
$k' * l' = m$ we have $specC \models app(k', l', m)$ and from the proposition we get
$P \models app(k', l', m)$. So by completeness of SLD-resolution, $app(k', l', m)$ (or a
more general atom) is a computed answer for $Q$. Summarizing, $Q$ succeeds and
produces all the required divisions of $m$ into two lists.
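The conclusion of this example can be observed on a small instance. The Python generator below is a hedged stand-in for running the query $app(K, L, m)$ with a ground list $m$: it mirrors the two clauses of APPEND for this mode of use only and is not an implementation of SLD-resolution. The function names are assumptions made for the illustration. Its answers are compared with the divisions of $m$ that $specC$ requires.

```python
def app_answers(m):
    """Enumerate pairs (k, l) answering app(K, L, m) for a ground Python list m."""
    # Clause 1: app([], L, L) <-
    yield [], m
    # Clause 2: app([H|K], L, [H|M]) <- app(K, L, M)
    if m:
        head, rest = m[0], m[1:]
        for k, l in app_answers(rest):
            yield [head] + k, l


def required_by_specC(m):
    """All (k, l) with k * l = m, i.e. the answers required by specC."""
    return [(m[:i], m[i:]) for i in range(len(m) + 1)]


for k, l in app_answers([1, 2, 3]):
    print(k, l)
```

The recursion is on the third argument, which gets strictly shorter at each step; this is the informal reason why the query terminates when $m$ is a list.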

We believe that the proposition above is a formalization of a common way 
of informal reasoning about completeness, which consists of checking that any 
tuple of argument values to be defined by the predicate is “covered” by some of 
its clauses. 

The method proposed here proves program completeness for queries that terminate.
This should not be seen as a disadvantage, since termination of a program
has to be established anyway. Notice that the proposition does not hold without
the termination requirement. The program $\{\, app(X, Y, Z) \leftarrow app(X, Y, Z) \,\}$ is a
counterexample.
4 Reasoning about Normal Programs

In this chapter, unless stated otherwise, the considered programs (queries) are
normal programs (queries). As the operational semantics we consider
SLDNF-resolution [AD94]. We usually skip “finitely” in phrases like “finitely fails”.

In order to introduce specifications for normal programs let us first consider
definite programs with queries which may contain negative literals. Assume that
we have a definite program $P$ complete w.r.t. a Herbrand specification $specC$ and
correct w.r.t. a specification $specS$ (C as in completeness, S as in soundness). If an
atomic query $A$ fails then $specC \models \neg A$. So for $P$ and atomic queries, finite failure
is correct w.r.t. the specification for completeness. Now consider a query $Q = p(t), \neg q(u)$.
If it succeeds with an answer $Q\theta$ then $spec' \models Q\theta$ for an interpretation $spec'$
that interprets $p$ as $specS$ and $q$ as $specC$. If $Q$ fails then $spec'' \models \neg Q$ for an
interpretation $spec''$ that interprets $p$ as $specC$ and $q$ as $specS$. In order to deal
with this phenomenon, we will use certain renamings of predicate symbols.

Definition 4.1. Let $\mathcal{L}$ be a first order language. Let $Q$ be a formula or a set of
formulae (e.g. a query or a program) of $\mathcal{L}$. Let us extend $\mathcal{L}$ by adding, for any
predicate symbol $p$, a new predicate symbol $p'$.

$Q'$ is $Q$ with $p$ replaced by $p'$ in every negative literal of $Q$ (for any predicate
symbol $p$, except for $=$). Similarly, $Q''$ is $Q$ with $p$ replaced by $p'$ in every positive
literal.

If $I$ is an interpretation for $\mathcal{L}$ then $I'$ is the interpretation obtained from $I$
by replacing each $p$ by $p'$.
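The two renamings are purely syntactic, which a few lines of Python can make explicit. This is a minimal sketch: the `Literal` record, the use of a trailing apostrophe for the new symbol, and the reading that `=` is left unrenamed in both $Q'$ and $Q''$ are assumptions made for this illustration.

```python
from typing import List, NamedTuple, Tuple


class Literal(NamedTuple):
    positive: bool         # False for a negated atom
    pred: str              # predicate symbol
    args: Tuple[str, ...]  # arguments, kept as opaque strings here


def rename(query: List[Literal], rename_positive: bool) -> List[Literal]:
    """Replace p by p' in the literals of the chosen polarity; '=' is never renamed."""
    result = []
    for lit in query:
        if lit.pred != "=" and lit.positive == rename_positive:
            lit = lit._replace(pred=lit.pred + "'")
        result.append(lit)
    return result


def prime(query: List[Literal]) -> List[Literal]:
    """Q': rename the predicate symbols of the negative literals."""
    return rename(query, rename_positive=False)


def double_prime(query: List[Literal]) -> List[Literal]:
    """Q'': rename the predicate symbols of the positive literals."""
    return rename(query, rename_positive=True)


# The query p(t), not q(u) discussed before the definition.
Q = [Literal(True, "p", ("t",)), Literal(False, "q", ("u",))]
print(prime(Q))
print(double_prime(Q))
```

For this query, `prime(Q)` keeps `p` and turns `q` into `q'`, while `double_prime(Q)` turns `p` into `p'` and keeps `q`, so one interpretation can give the two polarities of a predicate different meanings.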

For normal programs, a specification for correctness should describe two pos- 
sibly overlapping sets of ground atoms, those allowed to succeed and those al- 
lowed to fail. Similarly, a specification for completeness should describe two 
disjoint sets, of the ground atoms required to fail and of those required to
succeed. It is natural to allow any atom not required to fail to succeed, and to
allow any atom not required to succeed to fail. Hence the two sets needed to specify
completeness can be the complements of the two sets used to specify correctness.

Definition 4.2. A specification for a normal program is a pair $(specS, specC)$,
where $specC$ and $specS$ are interpretations such that $specC \subseteq specS$.

For an informal explanation, assume that a program $P$ is correct w.r.t. a
Herbrand specification $spec = (specS, specC)$. If a ground atomic query $A$ succeeds
then $A \in specS$. If $A$ fails then $A \notin specC$. If $P$ is complete w.r.t. $spec$
then any $A \in specC$ succeeds and any $A \notin specS$ fails.
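The informal explanation splits the ground atoms into three regions. The Python sketch below spells this out, reusing the two APPEND specifications of Section 3 as the pair $(specS, specC)$. That reuse, the encoding of lists as Python lists and of other terms as strings, and the function names are assumptions made for the illustration.

```python
def is_list(t):
    return isinstance(t, list)


def in_specS(k, l, m):
    # spec of Section 3.1: if k and l are lists then m is a list and k * l = m
    return not (is_list(k) and is_list(l)) or (is_list(m) and k + l == m)


def in_specC(k, l, m):
    # specC of Section 3.3: k, l, m are lists and k * l = m
    return is_list(k) and is_list(l) and is_list(m) and k + l == m


def status(k, l, m):
    """Classify the ground atom app(k, l, m) against the pair (specS, specC)."""
    if in_specC(k, l, m):
        return "required to succeed"
    if not in_specS(k, l, m):
        return "required to fail"
    return "unspecified"  # in specS but not in specC: may succeed or fail


print(status([], [1], [1]))
print(status([1], [2], [2, 1]))
print(status([], "a", "a"))
```

Since $specC \subseteq specS$, the first two regions are disjoint; the third is the set of atoms that are neither considered erroneous nor required to be computed.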

4.1 Correctness of Normal Programs 

Definition 4.3. We say that a program $P$ is correct with respect to a specification
$spec = (specS, specC)$ iff for any query $Q$

(i) every computed answer $Q$ satisfies: $specS \cup specC' \models Q'$,

(ii) if $Q$ finitely fails then $specS \cup specC' \models \neg Q''$.

In particular, if $P$ is correct with respect to $spec = (specS, specC)$, then
every computed answer $Q$ satisfies the following. For each positive literal $A$ in
$Q$, $specS \models A$, and for each negative literal $\neg A$ in $Q$, $specC \models \neg A$.

Theorem 4.4 (Correctness, normal programs). Let $P$ be a program, $Q$ a
query and $spec = (specS, specC)$ a specification, such that

(a) $specS \cup specC' \models P'$,

(b) $specS \cup specC' \cup spec_= \models$ ONLY-IF($P''$).

Then

(i) if $comp(P) \models_3 Q$ then $specS \cup specC' \models Q'$,

(ii) if $comp(P) \models_3 \neg Q$ then $specS \cup specC' \models \neg Q''$,

and $P$ is correct w.r.t. $spec$.

Proof (outline). The proof [DM01] is based on (1) similarity between $P' \cup$
ONLY-IF($P''$) $\cup$ CET and Stärk’s partial completion $pcomp(P)$ [Sta96], (2)
equivalence of 3-valued consequences of $comp(P)$ and classical consequences of
$pcomp(P)$ (modulo a certain syntactic transformation) [Sta96], and (3) soundness
of SLDNF-resolution w.r.t. 3-valued completion semantics [Doe94]. □

Theorem 4.4 is valid for any operational semantics that is sound w.r.t.
3-valued completion semantics. This includes constructive negation (cf. [Dra95]
and the references therein) and extensions of SLDNF-resolution allowing selection
of non-ground negative literals under certain conditions [Llo87,Sta96].

4.2 Completeness of Normal Programs 

To discuss completeness we need to refer to the notion of SLDNF-tree. We will 
follow the definition of Apt and Doets [AD94]. We outline it below; for the details
the reader is referred to [AD94] or [Doe94].

An SLDNF-tree for query Q and program P is a set of trees, with one of 
them distinguished as the main tree. The nodes of the trees are queries and the 
trees are, roughly speaking, SLDNF-trees of [Llo87]. Q is the root of the main 
tree. Any node with a non-ground negative literal selected is a leaf of a tree;
such a node is marked floundered. Whenever a ground negative literal $\neg A$ is
selected in a node $N$ then there exists a subsidiary tree with the root $A$. The
whole SLDNF-tree may be viewed as a tree of trees, in which the tree with the
node $N$ is the parent of the subsidiary tree with the root $A$.

The leaves of each tree can be marked failed or success, with the expected
meaning. So if a leaf $N$ is marked neither failed nor success then a negative
literal $\neg A$ is selected in $N$; moreover, $A$ is non-ground or the subsidiary tree for
$A$ neither succeeds nor finitely fails. A tree succeeds if it has a success leaf. A
tree finitely fails if it is finite and all its leaves are marked failed.

The SLDNF-tree succeeds (finitely fails) if the main tree does. To each success
leaf of the main tree there corresponds a computed answer substitution $\theta$ for $Q$
(and a computed answer $Q\theta$), defined as expected.

Definition 4.5. We say that a program $P$ is complete for a query $Q$ w.r.t. a
specification $spec = (specS, specC)$ iff

(i) $specS \cup specC' \models \neg Q'$ implies that some SLDNF-tree for $Q$ finitely fails,

(ii) $specS \cup specC' \models Q''\sigma$ implies that some SLDNF-tree for $Q$ succeeds with
an answer $Q\theta$ more general than $Q\sigma$.

Theorem 4.6 (Completeness, normal programs). Assume that the set
of function symbols is infinite. Let $P$ be a program, $Q$ a query and
$spec = (specS, specC)$ a specification such that

1. (a) $specS \cup specC' \models P'$, (b) $specS \cup specC' \cup spec_= \models$ ONLY-IF($P''$),

2. there exists an SLDNF-tree for $Q$ such that its main tree is finite and all the
leaves of the main tree are marked failed or success.

Then $P$ is complete for $Q$ w.r.t. $spec = (specS, specC)$.

Conditions (a), (b) are the same as in Theorem 4.4. Condition 2 implies that
each $\neg A$ selected in the main tree is ground and the subsidiary tree for $A$ succeeds
or fails. Notice that the SLDNF-tree may be infinite or contain floundering nodes.
However the “important part” of it is finite and without floundering and can be 
computed under some search strategy in a finite number of steps. (When a 
success is obtained in a subsidiary tree, traversing this tree can be abandoned.) 

Proof (outline). By Theorem 4.4 program P is correct w.r.t. spec. The existence 
of SLDNF-trees satisfying the thesis is proved by contradiction [DM01]. □ 

4.3 Examples 

In this section we illustrate our method of proving correctness and completeness 
of normal programs by two examples. The first one is a small program defining 
the subset relation. We have chosen this example (from [Sta96]) because it is 
short and it has nested negations. Completeness of normal programs can however 
be better illustrated by the second example defining the subset relation with an 
additional requirement that a subset must be a list without repetitions. 

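Example 4.1. Let $P$ be the following program:

    $subset(L, M) \leftarrow \neg notsubset(L, M)$
    $notsubset(L, M) \leftarrow member(X, L), \neg member(X, M)$
    $member(X, [X|L]) \leftarrow$
    $member(X, [T|L]) \leftarrow member(X, L)$

Consider the Herbrand specification $spec = (specS, specC)$, where

    $specS = sS_m \cup sS_n \cup sS_s$,  $specC = sC_m \cup sC_n \cup sC_s$
    $sS_m = \{\, member(x, l) \mid l \text{ is a list} \rightarrow x \in l \,\}$
    $sC_m = \{\, member(x, l) \mid l \text{ is a list} \wedge x \in l \,\}$
    $sS_n = \{\, notsubset(l, m) \mid l \text{ and } m \text{ are lists} \rightarrow l \not\subseteq m \,\}$
    $sC_n = \{\, notsubset(l, m) \mid l \text{ and } m \text{ are lists} \wedge l \not\subseteq m \,\}$
    $sS_s = \{\, subset(l, m) \mid l \text{ and } m \text{ are lists} \rightarrow l \subseteq m \,\}$
    $sC_s = \{\, subset(l, m) \mid l \text{ and } m \text{ are lists} \wedge l \subseteq m \,\}$
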
We would like to prove that our program is correct and complete with respect
to the above specification $spec$. We show that conditions (a) and (b) of Theorem
4.4 are satisfied. This implies that whenever $subset(l, m)$ is a computed answer
of $P$ then $sS_s \models subset(l, m)$, and if $l, m$ are lists then $l \subseteq m$. Also, whenever a
query $subset(l, m)$ fails then $sC_s \models \neg subset(l, m)$. (Notice that $l, m$ are ground,
as otherwise the query flounders.) Hence $l$ or $m$ is not a list or $l \not\subseteq m$. From
Theorem 4.6 it follows that, for ground lists $l, m$, if $l \subseteq m$ then $subset(l, m)$
succeeds, otherwise it fails, since under the Prolog selection rule any ground query
$subset(l, m)$ terminates without floundering.

Let $spec' = specS \cup specC'$. In order to prove condition (a), for each clause
$C$ of $P$ one has to show that $spec' \models C'$. In order to prove condition (b) one
has to show, for each predicate $p$ of $P$, that the implication of ONLY-IF($P''$)
for $p$ is true in the interpretation $spec' \cup spec_=$.

Let us consider the second clause of program $P$. For condition (a) we have
to prove that:

    $spec' \models notsubset(L, M) \leftarrow member(X, L) \wedge \neg member'(X, M)$.

Let $l, m, x$ be any elements of the universe such that
$spec' \models member(x, l) \wedge \neg member'(x, m)$. That means that $member(x, l) \in sS_m$ and
$member(x, m) \notin sC_m$. We would like to prove that $notsubset(l, m) \in sS_n$. So assume
that $l$ and $m$ are lists. From $member(x, l) \in sS_m$ we obtain that $x \in l$, and from
$member(x, m) \notin sC_m$ that $x \notin m$. Hence $l \not\subseteq m$.

For condition (b) and predicate $notsubset$ we have to show that

    $spec' \models notsubset'(L, M) \rightarrow \exists X\, (member'(X, L) \wedge \neg member(X, M))$.

Let $l, m$ be any elements of the universe such that $spec' \models notsubset'(l, m)$.
So $l$ and $m$ are lists and $l \not\subseteq m$. So there exists an element, say $a$, such that
$a \in l$ and $a \notin m$. Thus $member(a, l) \in sC_m$ and $member(a, m) \notin sS_m$. Hence
$spec' \models member'(a, l) \wedge \neg member(a, m)$ and
$spec' \models \exists X\, (member'(X, L) \wedge \neg member(X, M))$, so the implication above is true in $spec'$.

Let $C$ denote the first clause of $P$. Notice that $subset(L, M) \leftrightarrow \neg notsubset(L, M)$
is true both in $sS_s \cup sC_n$ and in $sC_s \cup sS_n$. After replacing
$notsubset$ by $notsubset'$, this implies $sS_s \cup sC_n' \models C'$, and hence (a) for
the first clause. After replacing $subset$ by $subset'$, this implies
$sS_n \cup sC_s' \models subset'(L, M) \rightarrow \neg notsubset(L, M)$, hence (b) for predicate $subset$.

The proof for predicate member boils down to a correctness and completeness 
proof of a definite program. □ 
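The two conditions proved above for $notsubset$ quantify over all elements of the universe, so they can be spot-checked on a small finite universe. The Python sketch below does so; it is a sanity check under stated assumptions, not a replacement for the proof. The universe (two atoms and all lists of length at most two over them, with lists encoded as tuples) and all function names are choices made for this illustration.

```python
from itertools import product

ATOMS = ["a", "b"]
LISTS = [tuple(p) for n in range(3) for p in product(ATOMS, repeat=n)]
U = ATOMS + LISTS  # assumed finite universe: atoms and short lists of atoms


def is_list(t):
    return isinstance(t, tuple)


def subset(l, m):
    return all(x in m for x in l)


def sS_member(x, l):      # l is a list -> x in l
    return not is_list(l) or x in l


def sC_member(x, l):      # l is a list and x in l
    return is_list(l) and x in l


def sS_notsubset(l, m):   # l, m are lists -> l is not a subset of m
    return not (is_list(l) and is_list(m)) or not subset(l, m)


def sC_notsubset(l, m):   # l, m are lists and l is not a subset of m
    return is_list(l) and is_list(m) and not subset(l, m)


def condition_a():
    # notsubset(L, M) <- member(X, L), not member'(X, M)
    return all(sS_notsubset(l, m)
               for x, l, m in product(U, repeat=3)
               if sS_member(x, l) and not sC_member(x, m))


def condition_b():
    # notsubset'(L, M) -> exists X (member'(X, L) and not member(X, M))
    return all(any(sC_member(x, l) and not sS_member(x, m) for x in U)
               for l, m in product(U, repeat=2)
               if sC_notsubset(l, m))


print(condition_a(), condition_b())
```

Primed predicates are read in the completeness sets (`sC_...`) and unprimed ones in the correctness sets (`sS_...`), which is how the interpretation $spec' = specS \cup specC'$ assigns them.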



Example 4.2. Let $P$ be the following program:

    $subs([\,], L) \leftarrow$
    $subs([H|T], LH) \leftarrow select(H, LH, L), subs(T, L), \neg member(H, T)$
    $select(H, [H|L], L) \leftarrow$
    $select(H, [X|L], [X|LH]) \leftarrow select(H, L, LH)$
The definition and specification of $member$ are the same as in Example 4.1. A
Herbrand specification for $P$ is $spec = (specS, specC)$, where

    $specS = sS_m \cup sS_{sel} \cup sS_{subs}$,  $specC = sC_m \cup sC_{sel} \cup sC_{subs}$
    $sS_{sel} = \{\, select(e, l, m) \mid l \text{ is a list} \rightarrow e \in l \wedge m \text{ is a list} \wedge l \approx [e|m] \,\}$
    $sC_{sel} = \{\, select(e, l, m) \mid l \text{ and } m \text{ are lists such that } l = [e_1, \ldots, e_i, e, e_{i+1}, \ldots, e_k],\ m = [e_1, \ldots, e_i, e_{i+1}, \ldots, e_k],\ 0 \leq i \leq k \,\}$
    $sS_{subs} = \{\, subs(l, m) \mid m \text{ is a list} \rightarrow listd(l) \wedge l \subseteq m \,\}$
    $sC_{subs} = \{\, subs(l, m) \mid m \text{ is a list} \wedge listd(l) \wedge l \subseteq m \,\}$

Here $l \approx m$ means that lists $l$ and $m$ contain the same elements and $listd(l)$
means that $l$ is a list with distinct elements.

Let $spec' = specS \cup specC'$. To prove condition (a) for the second clause of
predicate subs/2, assume that

$$spec' \models select(h,lh,l),\ subs(t,l),\ \neg member'(h,t). \qquad (A)$$

We show that $subs([h|t],lh) \in sS_{subs}$. So let $lh$ be a list. From (A) it
follows that:

(1) $select(h,lh,l) \in sS_{sel}$, hence $h \in lh$ and $l$ is a list such that
    $lh \approx [h|l]$ (since $lh$ is a list);

(2) $subs(t,l) \in sS_{subs}$, hence $listd(t)$ and $t \subseteq l$, thus
    $[h|t] \subseteq lh$, by (1);

(3) $member(h,t) \notin sC_m$, hence $h \notin t$ (since $t$ is a list), thus
    $listd([h|t])$, by (2).

We obtained $[h|t] \subseteq lh$ and $listd([h|t])$; this completes the proof of
condition (a) for the most complex clause of P.

The remaining part of the proof of condition (a) and proof of condition (b) 
is skipped here. It follows that P is correct w.r.t. spec. 

Consider a query Q = subs(L, M), where L is a variable and M a ground
list. Once it is shown that for such queries P terminates without floundering 
(under some selection rule and search strategy), it follows that P is complete 
for such queries. This means that for a given list all its subsets (and all their 
permutations) will be computed. 

Assume that we do not have a termination proof and request all answers 
to a query Q from an interpreter with run-time checks for floundering. Then if 
the execution terminates, we know that all the answers for Q required by the 
specification have been produced. This happens in the case of P and Prolog. □ 
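
The answers that P is expected to compute for such a query can be sketched outside Prolog. The Python generators below are only an illustrative transcription of the two predicates, not part of the original example: the names `select` and `subs` are chosen to mirror the program, Python lists stand in for ground Prolog lists, and the enumeration order is assumed to follow a left-to-right, depth-first strategy.

```python
def select(lst):
    """Yield (h, rest) for every way of removing one element h from lst.

    Mirrors select(H, LH, L): LH is the given list, L is the remainder.
    """
    for i, h in enumerate(lst):
        yield h, lst[:i] + lst[i + 1:]


def subs(m):
    """Yield every list l such that subs(l, m) holds for the ground list m."""
    # First clause: subs([], L).
    yield []
    # Second clause: select(H, LH, L), subs(T, L), not member(H, T).
    for h, rest in select(m):
        for t in subs(rest):
            if h not in t:
                yield [h] + t


answers = list(subs([1, 2]))
```

For the list `[1, 2]` the sketch enumerates the empty list, both singletons, and both orderings of the full list, which matches the remark that all subsets and all their permutations are computed.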

Every (logic) programmer should have, at least in her mind, intended
interpretations of all used relations. Specification spec is a formalization of
such interpretations. We believe that the methods advocated in this paper are a
formalization of informal reasoning performed by a competent programmer to
convince herself about correctness of a program.

5 Related Work 

In this section we present a brief overview of related work. A more detailed 
comparison with the work on definite programs can be found in [Dra99]. 
Due to our approach to specifications, we do not need any explicit notion of
precondition, type information, or domain of a procedure. Such notions are used
in most other approaches [BC89,Apt97,PR99,Dev90] to deal with “ill-typed”
atoms, for which the behaviour of the program is of no interest. Our approach is
related to a 3-valued approach to definite programs [Nai00]. We however avoid
any use of 3-valued logic. [Nai96] advocated a declarative view for a class of
program properties.

A completeness theorem for definite programs is given in [DM93]. It is stronger
than ours, as its premises do not refer to termination. It however requires
checking some other conditions, and that checking is similar to proving
termination. So whenever termination has to be shown anyway, our approach is
simpler.

Comparison with the operational method [BC89,Apt97,PR99] for correctness 
of definite programs has been given in Section 3.2. We showed that the natural 
method of Section 3.1 is not weaker. Some inconveniences of the operational 
approach are discussed in [Dra99]. The operational method can be generalized 
to correctness of normal programs [Apt93,PR99]. Here the comparison is similar.
The operational approach refers to LDNF-resolution, while the declarative
method of Section 4.1 is independent from the operational semantics. So it cov- 
ers arbitrary selection rules (e.g. delays used to avoid floundering) and various 
generalizations of SLDNF-resolution. We expect that reasoning in Section 3.2 
can be generalized to this case thus showing that the operational method is not 
stronger (as far as properties of program answers are concerned). 

Our approach to normal programs considers their 3-valued semantics, which 
is more precise than 2-valued, used in [Dev90,Apt93,PR99]. Further comparisons 
are needed with [Dev90] and with completeness-related reasoning in [PR99].

An important approach to proving properties of normal programs is proposed
by Stärk [Sta97]. It deals with normal programs, executed under the Prolog
selection rule. A tool to mechanically verify the proofs exists. Success, failure and
termination are described by an inductive theory. The theory can be seen as a 
further development of the notion of program completion. The program’s prop- 
erties of interest are expressed as formulae and one has to prove that they are 
consequences of the theory. This is opposite to our approach where properties 
are expressed as specifications and one has to prove that certain theory obtained 
from the program is a consequence of the specification. 



6 Conclusions 

This paper advocates declarative reasoning about logic programs. We show how 
to prove correctness and completeness of definite and normal logic programs in 
a declarative way, independently from the operational semantics. This makes 
it possible to separate reasoning about “logic” from reasoning about “control” . 
The method for proving correctness of definite programs is not new, however 
its usefulness has not been appreciated. The methods for completeness and for 
correctness of normal programs are a contribution of this work. 
For definite programs we refer to two specifications: one for correctness and
one for completeness. This makes it possible to specify the program semantics
approximately, thus simplifying the specifications and the proofs. In this paper
specifications are interpretations, but the approach seems applicable to
specifications being theories.

The semantics of programs with negation is 3-valued. We do not however
explicitly refer to 3-valued logic. Instead, a pair of specifications plays the
role of a 3-valued specification. In this case it turns out that the same pair of
specifications can conveniently be used both for correctness and completeness.

We believe that the presented proof methods are simple and natural. We 
claim that they are a formalization of a style of thinking in which a competent 
logic programmer reasons (or should reason) about her programs. We believe 
that these methods, possibly treated informally, are a valuable tool for actual 
everyday reasoning about real programs. We believe that they should be used 
in teaching logic programming. 
Abstract. We usually use natural language vocabulary for sort names in
order-sorted logics, and some sort names may contradict other sort names 
in the sort-hierarchy. These implicit negations, called lexical negations in 
linguistics, are not explicitly prefixed by the negation connective. In this 
paper, we propose the notions of structured sorts, sort relations, and the 
contradiction in the sort-hierarchy. These notions specify the properties 
of these implicit negations and the classical negation, and thus, we can 
declare the exclusivity and the totality between two sorts, one of which 
is affirmative while the other is negative. We regard the negative affix as 
a strong negation operator, and the negative lexicon as an antonymous 
sort that is exclusive to its counterpart in the hierarchy. In order to infer 
from these negations, we integrate a structured sort constraint system 
into a clausal inference system. 



1 Introduction 

Order-sorted logics, or many-sorted logics, have been well discussed as tools to 
represent hierarchical knowledge in the field of artificial intelligence [18,4,3,7,14], 
[16,19]. Recently, description logics [15,1,2,10], as outlined in [6], have been
studied as a theoretical approach to terminological knowledge representation;
they represent structured concepts in terms of more primitive concepts, which
are similar to those in a sort-hierarchy.

However, a sort-hierarchy may contain sorts with implicitly negative mean- 
ings. These negations are called lexical negations in linguistics and are distinct 
from the negative particle not. Since every sort name is a mere string or a sym- 
bol, these implicitly negative sorts are not interpreted to represent their original 
meanings. Nevertheless, a knowledge representation system should take account 
of the fact that the lexical negation ‘unhappy’ is opposite in meaning to the
positive expression ‘happy,’ or ‘winner’ contradicts ‘loser,’ in a sort-hierarchy.

In order to realize this, we have to analyze the properties of lexical negations 
in natural language and then consider dealing with these negations in a sort- 
hierarchy. In [12], lexical negations (words that implicitly have negative meaning) 
are classified as follows: 
(i) Negative affix (in-, un-, non-):
    incoherent, inactive, unfix, nonselfish, illogical, impolite, etc.

(ii) Lexicon with negative meaning:
    doubt (believe not), deny (approve not), prohibit (permit not),
    forget (remember not), etc.

First, we introduce a hybrid knowledge representation system of Beierle [3] 
that distinguishes between taxonomical information (in the sort-hierarchy) and 
assertional information (in the assertional knowledge base), as an extension of an 
order-sorted logic. This system can deal with the taxonomical information in an 
assertional knowledge base in which a sort symbol can be expressed as a unary 
predicate (called a sort predicate) in clausal forms. Since a sort and a unary 
predicate have the same expressive power, we can regard a subsort declaration
$s_1 \sqsubseteq_S s_2$ as the following logical implication form:

$$s_1(x) \rightarrow s_2(x)$$

where the unary predicates $s_1(x), s_2(x)$ corresponding to the sorts $s_1, s_2$
are sort predicates. Let $C, C_1, C_2$ be clauses, $s, s_1, s_2$ sorts (or sort
predicates), $\theta$ a sorted substitution, and $t, t_1, t_2$ sorted terms. In
order to use the information in a sort-hierarchy in a clausal knowledge base, or
an assertional knowledge base, the following inference rules:

$$\frac{\neg s_1(t_1) \vee C_1 \qquad s_2(t_2) \vee C_2}{\theta(C_1 \vee C_2)} \quad \text{(subsort resolution)}$$

where $s_2 \sqsubseteq_S s_1$ and $\theta(t_1) = \theta(t_2)$, and

$$\frac{\neg s(t) \vee C}{C} \quad \text{(elimination)}$$

where $Sort(t) \sqsubseteq_S s$,¹ are added to his resolution system. This hybrid
knowledge representation system provides a useful way to deal with a
sort-hierarchy in a clausal knowledge base.

Hereafter, we illustrate the deductions which we would like to realize in a 
sort-hierarchy with lexical negations. The first example concerns a negative affix: 
unhappy. A sort unhappy is not only a negative expression but also a subex- 
pression of emotional. Hence, the sort emotional can be derived from unhappy 
(like happy), whereas it cannot be derived from the classical negation $\neg happy$.
In addition, the sort unhappy is a stronger negative statement than the classical
negation $\neg happy$, so that $\neg happy$ can be derived from unhappy, but unhappy
cannot be derived from $\neg happy$. The fact $\neg emotional(bob)$, that is, the
person bob is not emotional, yields ‘$\neg happy(bob) \wedge \neg unhappy(bob)$.’ In
contrast, no premise can derive ‘$\neg happy(bob) \wedge \neg\neg happy(bob)$.’ This
shows the sort unhappy has the meaning of emotional, but the classical negation
$\neg happy$ does not have the meaning of emotional.

¹ For any sorted term $t$, the function $Sort(t)$ assigns its sort to term $t$.
The next example concerns a lexicon with negative meaning: loser. Suppose a
sort-hierarchy where both winner and loser are subsumed by player. Needless
to say, loser is different from the classical negation $\neg winner$ of winner,
because loser means the negative event opposite to an event denoted by win but
the classical negation $\neg winner$ denies the event denoted by win. Therefore,
the supersort player can be derived from loser (or winner), but not from
$\neg winner$. Furthermore, if the person tom is not a player, then the negations
$\neg winner$ and $\neg loser$ can be derived. In contrast, if the person tom is a
player, then winner or loser holds by the totality of winner and loser (i.e. tom
must be a winner or a loser). By the totality, if the person tom is a player but
not a loser ($\neg loser$), then tom is a winner. If tom is neither a winner nor a
loser ($\neg winner \wedge \neg loser$), then tom is not a player ($\neg player$).

We would like to derive these facts from implicitly negative sorts. However, 
it is hard to describe implicitly negative sorts in the sort-hierarchy, so that many 
knowledge bases would lose the property that implicit negations are exclusive 
to their antonyms and partial to their classical negation. In fact, Beierle’s infer- 
ence system for sort-hierarchy and order-sorted substitutions in clauses do not 
generate any reasoning mechanism for negative sorts. Description logics and fea- 
ture logics [16] provide complex sort expressions but not any clausal reasoning 
mechanism with these expressions. Therefore, these inference systems with sort- 
hierarchy cannot immediately derive the above results from subsorts, supersorts 
and classical negation. In the following sections, we will propose a method to 
describe the properties of lexical negations implicitly included in a sort-hierarchy 
and develop an inference machinery. 

This paper is organized as follows. Section 2 presents an order-sorted logic
that includes the complex sort expressions of implicit negations. We give an 
account of structured sorts, sort relations, and contradiction in a sort-hierarchy. 
Section 3 and Section 4 present the formalization of order-sorted logic with 
structured sorts, and systems of clausal resolution. In Section 5, we give our 
conclusions and discuss future work. 



2 Implicitly Negative Sorts 

In order to deal with implicitly negative sorts in a sort-hierarchy, we introduce 
structured sorts, sort relations, and contradiction in a sort-hierarchy into an 
order-sorted logic. These notions can be used to declare the properties of implic- 
itly negative sorts in a sort-hierarchy. 



2.1 Structured Sorts and Sort Relations 

We consider the representation of sorts in a hierarchy whose names are declared 
as lexical negations (classified as negative affixes or lexicons with negative mean- 
ing). In this paper, we call a sort denoted by a word with negative affix a negative 
sort and a sort denoted by a lexicon with negative meaning an opposite sort. In 
general, we call these sorts implicitly negative sorts. To represent these negative 
sorts, we introduce the notation of structured sorts and relations between sorts
whereby a negative sort is defined by the structured sort with strong negation
operator [17] and an opposite sort is defined by exclusivity. In particular, we
denote an opposite sort as exclusive to its antonymous sort in a hierarchy, so
that these two sorts exclude each other but neither sort is negative. In fact, we
should not say that an opposite sort is negative, rather we should say that these
two sorts are opposite in meaning.

Structured sorts are constructed from atomic sorts, the connectives $\sqcap$,
$\sqcup$, and the negative operators $\overline{\ \cdot\ }$ and $\sim$;
$\overline{happy}$ is a complement (classical negation) of happy, and
${\sim}happy$ is a negative sort (strong negation) of that.

We now give several relations between structured sorts in order to represent
implicitly negative sorts embedded in a sort-hierarchy. ‘$\sqsubseteq_S$’ denotes a
subsort relation between structured sorts. With this relation, a set of sorts is
partially ordered (i.e. the relation is reflexive, anti-symmetric, and
transitive). ‘$=_S$’ denotes an equivalence relation between structured sorts.
Furthermore, we add an exclusivity relation ‘$\parallel$’ and a totality relation
‘$\mid_{s_i}$’ between structured sorts; if $s \parallel s'$ then $s$ and $s'$ are
exclusive, and if $s \mid_{s_i} s'$ then $s$ together with $s'$ composes the whole
of $s_i$.

Using these sort relations, we can define the following properties (totality, 
partiality, and exclusivity) to declare various negations (in particular, lexical 
negations), as in Table 1. 



Table 1. Three negations

Negation type             Expression            Relationship                                              Property
(1) Complement            $\overline{happy}$    $\overline{happy} \mid_\top happy$ (in Axioms)            totality
    (classical negation)                        $\overline{happy} \parallel happy$                        exclusivity
(2) Negative sort         ${\sim}happy$         ${\sim}happy \parallel happy$ (in Axioms)                 exclusivity
    (strong negation)                           ${\sim}happy \sqsubseteq_S \overline{happy}$              partiality
(3) Opposite sort         sad                   $sad \parallel happy$ (in Declarations)                   exclusivity
    (antonym)


2.2 A Contradiction in a Sort-Hierarchy 

We present a contradiction in a sort-hierarchy containing the three negations 
(complement, negative sort, and opposite sort) that we have explained. 

A deductive system with implicitly negative sorts has to determine a con- 
tradiction in a sort-hierarchy in order that it can provide a sound inference 
mechanism derived from the three negations and their relations to each other. 
In classical logic, we can say that a set $\Delta$ of formulas is contradictory if
a formula $A$ and its classical negation $\neg A$ are simultaneously derivable
from $\Delta$. In this case, we can syntactically establish the contradiction,
because $\neg A$ indicates the negation of $A$ by the negative operator $\neg$.
Given the opposite sorts $s$ and $s'$ (e.g. winner and loser), we should also say
that $\Delta$ is contradictory if the two formulas $s(x), s'(x)$ denoted by the
sort predicates $s$ and $s'$ are simultaneously derivable from $\Delta$. This
indicates that the sort symbols $s$ and $s'$ have a negative relation to each
other in our language definition.²

Using an exclusivity relation between sorts, we give a definition of
contradictions in a sort-hierarchy that supports deduction from the three
negations. A set $\Delta$ of formulas is said to be contradictory if there exist
sorts $s, s'$ such that $s \parallel s'$ and $s(t)$ and $s'(t)$ are derivable from
$\Delta$. In Section 3, we will redefine the notion of contradiction in a
sort-hierarchy that enables our deduction system to ensure the consistency of a
knowledge base.

3 An Order-Sorted Logic with Structured Sorts 

Based on the specification proposed in Section 2, we define the syntax and
semantics of an order-sorted logic with structured sorts.



3.1 Structured Sort Signature 

Given a set $S$ of sort symbols, every sort $s_i \in S$ is called an atomic sort.
We define the set of structured sorts composed by the atomic sorts, the
connectives, and the negative operators as follows.

Definition 1 (Structured sorts). Given a set $S$ of atomic sorts, the set $S^+$
of structured sorts is defined by:

(1) If $s \in S$, then $s \in S^+$,

(2) If $s, s' \in S^+$, then $(s \sqcap s'), (s \sqcup s'), (\overline{s}), ({\sim}s) \in S^+$.

The structured sort $\overline{s}$ is called the classical negation of sort $s$
and the structured sort ${\sim}s$ is called the strong negation of sort $s$. For
convenience, we can denote $s \sqcap s'$, $s \sqcup s'$, $\overline{s}$ and
${\sim}s$ without parentheses when there is no confusion.

Example 1. Given the atomic sorts male, student, person and happy, we can
give structured sorts as follows.

$$student \sqcap \overline{male}, \qquad person \sqcup {\sim}happy.$$

The structured sort $student \sqcap \overline{male}$ means “students that are not
male,” and the structured sort $person \sqcup {\sim}happy$ means “individuals that
are persons or unhappy.”

² Gabbay and Hunter introduce the notation $\neg\alpha\beta$ that means
‘$\alpha$ negates $\beta$,’ concerning the contradictory sorts [8].
We define a sorted signature on the set $S^+$ of structured sorts. $\mathcal{F}_n$
is a set of n-ary function symbols ($f, f_0, f_1, \ldots$), and $\mathcal{P}_n$ is
a set of n-ary predicate symbols ($p, p_0, p_1, \ldots$). Let
$S = \{s_1, \ldots, s_n\}$ be a set of atomic sorts. We introduce the sort
predicates $p_{s_1}, \ldots, p_{s_n}$ (discussed in [3]) indexed by the sorts
$s_1, \ldots, s_n$, where $p_{s_i}$ is a unary predicate (i.e.
$p_{s_i} \in \mathcal{P}_1$) and equivalent to the sort $s_i$. We simply write $s$
for $p_s$ when this will not cause confusion. For example, instead of the formula
$p_s(t)$ where $t$ is a term, we use the notation $s(t)$. We denote by
$\mathcal{P}_S$ the set $\{p_s \in \mathcal{P}_1 \mid s \in S - \{\top, \bot\}\}$ of
the sort predicates indexed by all sorts in $S - \{\top, \bot\}$. A sorted
signature extended to include structured sorts and sort predicates is defined as
follows.

Definition 2 (Sorted signature on $S^+$). A sorted signature on $S^+$, which
we call a structured sort signature, is an ordered quadruple
$\Sigma^+ = (S^+, \mathcal{F}, \mathcal{P}, \Omega)$ satisfying the following
conditions:

(1) $S^+$ is the set of all structured sorts constructed by $S$.

(2) $\mathcal{F}$ is the set $\bigcup_{n \geq 0} \mathcal{F}_n$ of all function symbols.

(3) $\mathcal{P}$ is the set $\bigcup_{n \geq 0} \mathcal{P}_n$ of all predicate symbols.

(4) $\Omega$ is a set of sort declarations of functions and predicates such that:

    (i) If $f \in \mathcal{F}_n$, then $f\colon s_1 \times \ldots \times s_n \rightarrow s \in \Omega$
        where $s_1, \ldots, s_n, s \in S - \{\bot\}$. In particular, if
        $c \in \mathcal{F}_0$, then $c\colon \rightarrow s \in \Omega$.

    (ii) If $p \in \mathcal{P}_n$, then $p\colon s_1 \times \ldots \times s_n \in \Omega$
        where $s_1, \ldots, s_n \in S - \{\bot\}$. In particular, if
        $p_s \in \mathcal{P}_S$, then $p_s\colon \top \in \Omega$.

Note that the sort declarations of functions and predicates are given by atomic 
sorts. The structured sort signatures do not include subsort declarations. 

3.2 Sort-Hierarchy Declaration 

We will build a sort-hierarchy over $S^+$ instead of subsort declarations in
sorted signatures of typical order-sorted logics. In our logic, we cannot
enumerate all the subsort relations on $S^+$ because the set of subsort
declarations representing a subsort relation may be infinite. Hence, we first
give a finite set of subsort declarations, so that the subsort relation can be
derived by a sort constraint system. For this purpose, we deal with subsort
declarations as subsort formulas but not as static expressions in signatures. Let
$\Sigma^+ = (S^+, \mathcal{F}, \mathcal{P}, \Omega)$ be a structured sort
signature. For $s, s' \in S^+$, $s \sqsubseteq_S s'$ is said to be a subsort
declaration over $\Sigma^+$ that indicates $s$ is a subsort of $s'$. For instance,

$$player \sqcap winner \sqsubseteq_S person$$

is a subsort declaration over a structured sort signature. We denote by
$D_{S^+} = \{s \sqsubseteq_S s' \mid s, s' \in S^+\}$ the set of all subsort
declarations on $S^+$. In the next definition, the sort-hierarchy is obtained by
a finite set of subsort declarations.

Definition 3 (Sort-hierarchy declaration). A sort-hierarchy declaration
is an ordered pair $H = (S^+, D)$, where

(1) $S^+$ is the set of structured sorts constructed by $S$,

(2) $D$ is a finite set $\{s_1 \sqsubseteq_S s'_1, s_2 \sqsubseteq_S s'_2, \ldots\}$
    of subsort declarations on $S^+$.

Extended declarations on $S^+$ are defined by subsort declarations as follows.

Definition 4. A sort equivalence declaration, an exclusivity declaration and a
totality declaration are defined respectively by

- $s =_S s'$ iff $s \sqsubseteq_S s'$ and $s' \sqsubseteq_S s$.

- $s \parallel s'$ iff $(s \sqcap s') =_S \bot$.

- $s \mid_{s_i} s'$ iff $(s \sqcup s') =_S s_i$.

We use the abbreviation $s \mid s'$ to denote $s \mid_\top s'$. The above
notations are useful for declaring complicated sort relations in a
sort-hierarchy declaration $H = (S^+, D)$.

Example 2. The sort-hierarchy declaration $H = (S^+, D)$ consists of the set
$S^+$ of structured sorts constructed by

$$S = \{person, winner, loser, player, \top, \bot\}$$

and the finite set $D$ of subsort declarations with

$$D = \{winner \sqsubseteq_S player,\ player \sqsubseteq_S person,\ loser \sqsubseteq_S player,\ winner \mid_{player} loser,\ winner \parallel loser\}.$$

The sorts winner and loser are subsorts of player, and the sort player is a
subsort of person. The totality declaration $winner \mid_{player} loser$ indicates
that winner and loser have the property totality in player. The exclusivity
declaration $winner \parallel loser$ indicates that winner and loser are mutually
exclusive.
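
Definition 4 treats equivalence, exclusivity and totality as abbreviations for ordinary subsort declarations. The following Python sketch shows that expansion on the declarations of Example 2. The encoding is an assumption made only for illustration: atomic sorts are strings, structured sorts are nested tuples tagged `"meet"` and `"join"`, the strings `"TOP"` and `"BOT"` stand for the greatest and least sorts, and a pair `(a, b)` is read as the subsort declaration that `a` is a subsort of `b`.

```python
TOP, BOT = "TOP", "BOT"


def meet(s, t):
    """The structured sort formed by the conjunction of s and t."""
    return ("meet", s, t)


def join(s, t):
    """The structured sort formed by the disjunction of s and t."""
    return ("join", s, t)


def equiv(s, t):
    """Sort equivalence, expanded into two subsort declarations."""
    return [(s, t), (t, s)]


def exclusive(s, t):
    """Exclusivity of s and t: their conjunction is equivalent to BOT."""
    return equiv(meet(s, t), BOT)


def total(s, t, whole=TOP):
    """Totality of s and t in whole: their disjunction is equivalent to whole."""
    return equiv(join(s, t), whole)


# The set D of Example 2, with the extended declarations expanded.
D = [("winner", "player"), ("player", "person"), ("loser", "player")]
D += total("winner", "loser", whole="player")
D += exclusive("winner", "loser")
```

After the expansion `D` contains only plain subsort pairs, so a constraint system needs no special cases for the extended declarations.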

3.3 Structured Sort Constraint System 

We develop a constraint system with respect to a subsort relation on $S^+$.

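Definition 5. Let $s, s', s''$ be structured sorts. The axioms and rules of
structured sort constraint system CS consist of:

Reflexivity               $s \sqsubseteq_S s$
Idempotency               $s \sqsubseteq_S s \sqcap s$, $\quad s \sqcup s \sqsubseteq_S s$
Commutativity             $s \sqcap s' \sqsubseteq_S s' \sqcap s$, $\quad s \sqcup s' \sqsubseteq_S s' \sqcup s$
Associativity             $(s \sqcap s') \sqcap s'' =_S s \sqcap (s' \sqcap s'')$, $\quad (s \sqcup s') \sqcup s'' =_S s \sqcup (s' \sqcup s'')$
Distributivity            $(s \sqcup s') \sqcap s'' =_S (s \sqcap s'') \sqcup (s' \sqcap s'')$, $\quad (s \sqcap s') \sqcup s'' =_S (s \sqcup s'') \sqcap (s' \sqcup s'')$
Least and greatest sorts  $\bot \sqsubseteq_S s$, $\quad s \sqsubseteq_S \top$
Conjunction               $s \sqcap s' \sqsubseteq_S s$, $\quad s \sqsubseteq_S s \sqcap \top$
Disjunction               $s \sqsubseteq_S s \sqcup s'$, $\quad s \sqcup \bot \sqsubseteq_S s$
Absorption                $(s \sqcap s') \sqcup s \sqsubseteq_S s$, $\quad s \sqsubseteq_S (s \sqcup s') \sqcap s$
Classical negation        $s \parallel \overline{s}$, $\quad s \mid \overline{s}$, $\quad \overline{s \sqcap s'} =_S \overline{s} \sqcup \overline{s'}$, $\quad \overline{s \sqcup s'} =_S \overline{s} \sqcap \overline{s'}$
Strong negation           $s \parallel {\sim}s$, $\quad {\sim}s \sqsubseteq_S \overline{s}$

Transitivity rule
$$\frac{s \sqsubseteq_S s' \qquad s' \sqsubseteq_S s''}{s \sqsubseteq_S s''}$$

Introduction rule
$$\frac{s \sqsubseteq_S s'}{s'' \sqcap s \sqsubseteq_S s'' \sqcap s'}$$

Elimination rule
$$\frac{s \sqcup s' \sqsubseteq_S s \sqcup s'' \qquad s \parallel s' \qquad s \parallel s''}{s' \sqsubseteq_S s''}$$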



A derivation of an expression (a subsort declaration, or a clause which we will 
define) from a set of expressions is defined as follows. 

Definition 6 (Derivation). Let $\Delta$ be a set of expressions. A derivation of
$F_n$ in a system X from $\Delta$ is a finite sequence $F_1, F_2, \ldots, F_n$ such
that, for each $F_i$,

(i) $F_i \in \Delta$,

(ii) $F_i$ is an axiom of system X, or

(iii) $F_i$ follows from $F_j$ ($j < i$) by one of the rules of system X.

We write $\Delta \vdash_X F$ if $F$ has a derivation from $\Delta$ in the system X.
This notion of derivations can be used for the structured sort constraint system
CS, and a clausal inference system which we will present.
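
To make the notion of derivation concrete, the sketch below closes a finite set of declarations under two ingredients of CS only, the reflexivity axiom and the transitivity rule, and only for atomic sorts. It is a deliberately restricted illustration rather than an implementation of the full system: the function name `derive_subsorts` and the pair encoding `(a, b)` for a declaration that `a` is a subsort of `b` are assumptions introduced here.

```python
def derive_subsorts(declarations, sorts):
    """Close atomic subsort declarations under reflexivity and transitivity.

    Each pair (a, b) is read as: a is a subsort of b. The result contains
    every pair obtainable by repeatedly applying the transitivity rule to
    the declarations and the reflexivity axioms.
    """
    derived = set(declarations) | {(s, s) for s in sorts}
    changed = True
    while changed:
        changed = False
        for (a, b) in list(derived):
            for (c, d) in list(derived):
                # Transitivity rule: from a <= b and b <= d infer a <= d.
                if b == c and (a, d) not in derived:
                    derived.add((a, d))
                    changed = True
    return derived


sorts = {"winner", "loser", "player", "person"}
D = {("winner", "player"), ("loser", "player"), ("player", "person")}
closure = derive_subsorts(D, sorts)
```

Each pair added in the loop corresponds to one step of a derivation in the sense of Definition 6, so membership in `closure` plays the role of derivability for this fragment.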



3.4 Sorted Terms and Formulas with Structured Sort Constraints 

An alphabet for an order-sorted first-order language $\mathcal{L}_{\Sigma^+}$ of
structured sort signature $\Sigma^+$ contains the following: the set
$\mathcal{V} = \bigcup_{s \in S - \{\bot\}} \mathcal{V}_s$ of variables for all
atomic sorts in $S - \{\bot\}$ (where $\mathcal{V}_s$ is a set of variables
$x_1\colon s, x_2\colon s, \ldots$ for atomic sort $s$), the connectives
$\neg, \wedge, \vee, \rightarrow$, the quantifiers $\forall, \exists$, and the
auxiliary parentheses and commas.

We give the expressions sorted term and formula for our order-sorted first- 
order language with structured sorts. 

Definition 7 (Sorted terms). Let $\Sigma^+ = (S^+, \mathcal{F}, \mathcal{P}, \Omega)$
be a structured sort signature and let $H = (S^+, D)$ be a sort-hierarchy
declaration. The set $TERM_{\Sigma^+,s}$ of terms of sort $s$ is defined by:

(1) A variable $x\colon s$ is a term of sort $s$.

(2) A constant $c\colon s$ is a term of sort $s$ where $c \in \mathcal{F}_0$ and
    $c\colon \rightarrow s \in \Omega$.

(3) If $t_1, \ldots, t_n$ are terms of sorts $s_1, \ldots, s_n$, then
    $f(t_1, \ldots, t_n)\colon s$ is a term of sort $s$ where $f \in \mathcal{F}_n$
    and $f\colon s_1 \times \ldots \times s_n \rightarrow s \in \Omega$.

(4) If $t$ is a term of sort $s'$ with $D \vdash_{CS} s' \sqsubseteq_S s$, then $t$
    is a term of sort $s$.

We denote by $TERM_{\Sigma^+}$ the set of all sorted terms
$\bigcup_{s \in S - \{\bot\}} TERM_{\Sigma^+,s}$.

We define a structured sort substitution with respect to a subsort relation 
derivable in the constraint system CS. That is, the subsort declarations are 
obtained by an application of the rules from CS so that the substitution is 
defined via the subsort declarations. 
Definition 8 (Structured sort substitution). A structured sort substitution
is a function $\theta$ mapping from a finite set of variables to $TERM_{\Sigma^+}$
where $\theta(x\colon s) \neq x\colon s$ and
$\theta(x\colon s) \in TERM_{\Sigma^+,s}$.³

In the above definition none of the terms of sort $\bot$ can be substituted for
variables. If there do not exist subsorts $s'$ of $s$ such that $s' \neq \bot$,
then the substitutions correspond to many-sorted substitutions (i.e. not
order-sorted substitutions).

Definition 9 (Sorted formulas). Let $\Sigma^+ = (S^+, \mathcal{F}, \mathcal{P}, \Omega)$
be a structured sort signature and let $H = (S^+, D)$ be a sort-hierarchy
declaration. The set $FORM_{\Sigma^+}$ of sorted formulas is defined by:

(1) If $t_1, \ldots, t_n$ are terms of $s_1, \ldots, s_n$, then
    $p(t_1, \ldots, t_n)$ is an atomic formula (or simply an atom) where
    $p \in \mathcal{P}_n$ and $p\colon s_1 \times \ldots \times s_n \in \Omega$,

(2) If $A$ and $B$ are formulas, then $(\neg A)$, $(A \wedge B)$, $(A \vee B)$,
    $(A \rightarrow B)$, $(\forall x\colon s\, A)$, and $(\exists x\colon s\, A)$
    are formulas.

We introduce literals in order to represent formulas in clause form. A positive
literal is an atomic formula $p(t_1, \ldots, t_n)$, and a negative literal is the
negation $\neg p(t_1, \ldots, t_n)$ of an atomic formula. A literal is a positive
or a negative literal.

Definition 10. Let $L_1, \ldots, L_n$ be literals. The formula
$L_1 \vee \ldots \vee L_n$ ($n \geq 0$) is said to be a clause. We denote by
$CL_{\Sigma^+}$ the set of all clauses.

3.5 $\Sigma^+$-structure

As in the semantics of standard order-sorted logics, we consider a structure that
consists of the universe and an interpretation over
$S^+ \cup \mathcal{F} \cup \mathcal{P}$ and satisfies the sort declarations of
functions and predicates on $S$. The interpretation of atomic sorts is defined by
subsets of the universe. Hence, the interpretation of structured sorts is
constructed by the interpretation of atomic sorts and the operations of set
theory.

Definition 11. Given a structured sort signature
$\Sigma^+ = (S^+, \mathcal{F}, \mathcal{P}, \Omega)$, a $\Sigma^+$-structure is an
ordered pair $M^+ = (U, I^+)$ such that

(1) $U$ is a non-empty set.

(2) $I^+$ is a function on $S^+ \cup \mathcal{F} \cup \mathcal{P}$ where

    • $I^+(s) \subseteq U$ (in particular, $I^+(\top) = U$ and $I^+(\bot) = \emptyset$),
      $I^+(s \sqcap s') = I^+(s) \cap I^+(s')$,
      $I^+(s \sqcup s') = I^+(s) \cup I^+(s')$,
      $I^+(\overline{s}) = I^+(\top) - I^+(s)$,
      $I^+({\sim}s) \subseteq I^+(\top) - I^+(s)$,

    • $I^+(f)\colon I^+(s_1) \times \ldots \times I^+(s_n) \rightarrow I^+(s)$ where
      $f \in \mathcal{F}_n$ and $f\colon s_1 \times \ldots \times s_n \rightarrow s \in \Omega$,

    • $I^+(p) \subseteq I^+(s_1) \times \ldots \times I^+(s_n)$ where
      $p \in \mathcal{P}_n$ and $p\colon s_1 \times \ldots \times s_n \in \Omega$ (in
      particular, $I^+(p_s) = I^+(s)$ where $p_s \in \mathcal{P}_S$ and
      $p_s\colon \top \in \Omega$).

³ In order to substitute variables with terms of the subsorts, the set
$TERM_{\Sigma^+,s}$ of terms of sort $s$ contains the terms of their subsorts
obtained by subsort declarations that are derivable using a sort constraint
system.

A variable assignment (or simply an assignment) in a $\Sigma^+$-structure
$M^+ = (U, I^+)$ is a function $\alpha\colon \mathcal{V} \rightarrow U$ where
$\alpha(x\colon s) \in I^+(s)$ for all variables $x\colon s \in \mathcal{V}$. Let
$\alpha$ be an assignment in a $\Sigma^+$-structure $M^+ = (U, I^+)$, let
$x\colon s$ be a variable in $\mathcal{V}$, and $d \in I^+(s)$. The assignment
$\alpha[d/x\colon s]$ is defined by
$\alpha[d/x\colon s] = (\alpha - \{(x\colon s, \alpha(x\colon s))\}) \cup \{(x\colon s, d)\}$.

We now define an interpretation over structured sort signatures $\Sigma^+$. If an
interpretation $\mathcal{I}^+$ consists of a $\Sigma^+$-structure $M^+$ and an
assignment $\alpha$ in $M^+$, then $\mathcal{I}^+$ is said to be a
$\Sigma^+$-interpretation.

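Definition 12. Let $\mathcal{I}^+ = (M^+, \alpha)$ be a $\Sigma^+$-interpretation.
The denotation $[\![\ ]\!]_\alpha$ is defined by

(1) $[\![x\colon s]\!]_\alpha = \alpha(x\colon s)$,

(2) $[\![c\colon s]\!]_\alpha = I^+(c)$ with $I^+(c) \in I^+(s)$,

(3) $[\![f(t_1, \ldots, t_n)\colon s]\!]_\alpha = I^+(f)([\![t_1]\!]_\alpha, \ldots, [\![t_n]\!]_\alpha)$.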

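We formalize a satisfiability relation indicating that a
$\Sigma^+$-interpretation satisfies sorted formulas and subsort declarations.

Definition 13. Let $\mathcal{I}^+ = (M^+, \alpha)$ be a $\Sigma^+$-interpretation
and let $F$ be a sorted formula or a subsort declaration. We define the
satisfiability relation $\mathcal{I}^+ \models F$ by the following rules:

(1) $\mathcal{I}^+ \models p(t_1, \ldots, t_n)$ iff $([\![t_1]\!]_\alpha, \ldots, [\![t_n]\!]_\alpha) \in I^+(p)$,

(2) $\mathcal{I}^+ \models (\neg A)$ iff $\mathcal{I}^+ \not\models A$,

(3) $\mathcal{I}^+ \models (A \wedge B)$ iff $\mathcal{I}^+ \models A$ and $\mathcal{I}^+ \models B$,

(4) $\mathcal{I}^+ \models (A \vee B)$ iff $\mathcal{I}^+ \models A$ or $\mathcal{I}^+ \models B$,

(5) $\mathcal{I}^+ \models (A \rightarrow B)$ iff $\mathcal{I}^+ \not\models A$ or $\mathcal{I}^+ \models B$,

(6) $\mathcal{I}^+ \models (\forall x\colon s)A$ iff for all $d \in I^+(s)$, $\mathcal{I}^+[d/x\colon s] \models A$ holds,

(7) $\mathcal{I}^+ \models (\exists x\colon s)A$ iff for some $d \in I^+(s)$, $\mathcal{I}^+[d/x\colon s] \models A$ holds,

(8) $\mathcal{I}^+ \models s \sqsubseteq_S s'$ iff $I^+(s) \subseteq I^+(s')$.

Let $F$ be a sorted formula or a subsort declaration and let
$\Gamma \subseteq FORM_{\Sigma^+} \cup D_{S^+}$. If $\mathcal{I}^+ \models F$, then
$\mathcal{I}^+$ is said to be a $\Sigma^+$-model of $F$. We denote
$\mathcal{I}^+ \models \Gamma$ if $\mathcal{I}^+ \models F$ for every
$F \in \Gamma$. If $\mathcal{I}^+ \models \Gamma$, then $\mathcal{I}^+$ is said to
be a $\Sigma^+$-model of $\Gamma$. If $\Gamma$ has a $\Sigma^+$-model, then
$\Gamma$ is $\Sigma^+$-satisfiable. If $\Gamma$ has no $\Sigma^+$-model, then
$\Gamma$ is $\Sigma^+$-unsatisfiable. If every $\Sigma^+$-interpretation
$\mathcal{I}^+$ is a $\Sigma^+$-model of $F$, then $F$ is said to be
$\Sigma^+$-valid. We write $\Gamma \models F$ ($F$ is a consequence of $\Gamma$ in
the class of $\Sigma^+$-structures) if every $\Sigma^+$-model of $\Gamma$ is a
$\Sigma^+$-model of $F$ ($\in FORM_{\Sigma^+} \cup D_{S^+}$).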
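
The set-theoretic reading of structured sorts and of rule (8) can be checked on a small finite universe. The Python sketch below is an illustration under stated assumptions, not the paper's formalism: sorts are encoded as strings or tuples tagged `"meet"`, `"join"`, `"comp"` (classical negation) and `"strong"` (strong negation), the helper names `make_interpretation` and `satisfies` are hypothetical, and the individuals `ann`, `bob` and `tom` with their sort memberships are invented for the example. Because the interpretation of a strong negation is only required to be a subset of the complement, the sketch takes it as an explicit input.

```python
def make_interpretation(universe, atomic, strong=None):
    """Return a function I mapping structured sorts to subsets of the universe.

    atomic maps atomic sort names to sets of individuals; strong maps a sort
    to the chosen interpretation of its strong negation.
    """
    universe = frozenset(universe)
    strong = strong or {}

    def I(sort):
        if sort == "TOP":
            return universe
        if sort == "BOT":
            return frozenset()
        if isinstance(sort, str):
            return frozenset(atomic[sort])
        op = sort[0]
        if op == "meet":
            return I(sort[1]) & I(sort[2])
        if op == "join":
            return I(sort[1]) | I(sort[2])
        if op == "comp":
            return universe - I(sort[1])
        if op == "strong":
            chosen = frozenset(strong.get(sort[1], ()))
            # Strong negation must stay inside the classical complement.
            if not chosen <= universe - I(sort[1]):
                raise ValueError("strong negation must lie inside the complement")
            return chosen
        raise ValueError(f"unknown sort constructor: {op!r}")

    return I


def satisfies(I, sub, sup):
    """Rule (8): a subsort declaration holds iff the interpretations are nested."""
    return I(sub) <= I(sup)


I = make_interpretation(
    universe={"ann", "bob", "tom"},
    atomic={
        "person": {"ann", "bob", "tom"},
        "player": {"ann", "bob"},
        "winner": {"ann"},
        "loser": {"bob"},
        "happy": {"ann"},
    },
    strong={"happy": {"bob"}},
)
```

In this invented structure the declarations of Example 2 hold, and the strong negation of happy is a subsort of its classical negation but not conversely, which is the partiality property listed in Table 1.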

Let $H = (S^+, D)$ be a sort-hierarchy declaration and $\Delta$ a set of clauses.
In the clausal inference system we will present in the next section, the rules
are applied to clauses in $\Delta$ (which expresses an assertional knowledge
base), related to a subsort relation derivable from $D$. If $\mathcal{I}^+$ is a
$\Sigma^+$-model of both $D$ and $\Delta$, then $\mathcal{I}^+$ is said to be a
$\Sigma^+$-model of $(D, \Delta)$, denoted by $\mathcal{I}^+ \models (D, \Delta)$.
We write $(D, \Delta) \models F$ ($F$ is a consequence of $(D, \Delta)$ in the
class of $\Sigma^+$-structures) if every $\Sigma^+$-model of $(D, \Delta)$ is a
$\Sigma^+$-model of $F$ ($\in FORM_{\Sigma^+} \cup D_{S^+}$).
4 Resolution with Structured Sorts

In addition to structured sort constraint system CS, we design a (clausal) reso- 
lution system with structured sorts. We adopt the method (proposed in [3]) of 
coupling a clausal knowledge base [13,11] and a sort-hierarchy in which every 
sort can be used to express the sort predicate which is included in clauses. Then, 
we define a hybrid inference system in order to combine the two systems. 

4.1 Clausal Inference System with Sort Predicates 

We present a clausal inference system, in which clauses may include sort
predicates, e.g., $p(t_1, t_2) \lor s(t)$ where $s$ is a sort predicate.

Definition 14 (Cut rule). Let $L, L'$ be positive literals and $C, C'$ clauses.

$$\frac{\neg L \lor C \qquad L' \lor C'}{(C \lor C')\theta}$$

where there exists a mgu $\theta$ for $L$ and $L'$.

The cut rule is one of the usual rules included in clausal inference systems. In
addition to the cut rule, our clausal inference system has to include inference
rules for sort predicates related to subsort declarations. We introduce the
inference rules for resolution as follows.

Definition 15 (Resolution rules with sort predicates). Let $s, s', s_1$ be
structured sorts or sort predicates, $L, L'$ positive literals, $t, t'$ sorted
terms, and $C, C'$ clauses. Resolution rules with sort predicates are given as
follows.

Subsort rule.

$$\frac{\neg s(t) \lor C \qquad s'(t') \lor C' \qquad s' \sqsubseteq_S s}{(C \lor C')\theta}$$

where there exists a mgu $\theta$ for $t$ and $t'$.

Sort predicate rule.*

$$\frac{\neg s(t) \lor C \qquad s' \sqsubseteq_S s}{C}$$

where $t \in TERM_{\Sigma^+, s'}$.

Exclusivity rule.

$$\frac{s(t) \lor C \qquad s'(t') \lor C' \qquad s \parallel s'}{(C \lor C')\theta}$$

where there exists a mgu $\theta$ for $t$ and $t'$.

Totality rule.

$$\frac{s_1(t) \lor C \qquad \neg s(t') \lor C' \qquad s \mid_{s_1} s'}{(s'(t) \lor C \lor C')\theta}$$

where there exists a mgu $\theta$ for $t$ and $t'$.

* Instead of the sort predicate rule, the subsort rule can derive the same results by
adding valid atoms with sort predicates.

In particular, the exclusivity rule and the totality rule are useful for resolution
with respect to implicit negations embedded in a sort-hierarchy. The exclusivity
rule will be applied when an opposite sort is declared as $s \parallel s'$. We write
RS for the resolution system defined by the cut rule in Definition 14 and the
resolution rules in Definition 15.
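
The subsort and exclusivity rules can be illustrated on ground clauses. The following is a minimal sketch, not the system of this paper: it assumes ground terms (so the mgu degenerates to syntactic equality), and the clause encoding and the names `lit`, `subsort_resolvents`, and `exclusivity_resolvents` are hypothetical.

```python
from itertools import product

# Assumed encoding: a literal is (positive, predicate, term) with a ground
# term given as a string; a clause is a frozenset of literals.
def lit(pred, term, positive=True):
    return (positive, pred, term)

def subsort_resolvents(c1, c2, subsort):
    """Ground subsort rule: from ~s(t) v C, s'(t) v C' and s' <= s derive C v C'.

    `subsort` is a set of pairs (s', s) meaning s' is a subsort of s.
    """
    out = set()
    for l1, l2 in product(c1, c2):
        pos1, s, t1 = l1
        pos2, s_sub, t2 = l2
        if not pos1 and pos2 and t1 == t2 and (s_sub, s) in subsort:
            out.add((c1 - {l1}) | (c2 - {l2}))
    return out

def exclusivity_resolvents(c1, c2, exclusive):
    """Ground exclusivity rule: from s(t) v C, s'(t) v C' and s || s' derive C v C'.

    `exclusive` is a set of two-element frozensets {s, s'}.
    """
    out = set()
    for l1, l2 in product(c1, c2):
        pos1, s1, t1 = l1
        pos2, s2, t2 = l2
        if pos1 and pos2 and t1 == t2 and frozenset((s1, s2)) in exclusive:
            out.add((c1 - {l1}) | (c2 - {l2}))
    return out

if __name__ == "__main__":
    exclusive = {frozenset(("happy", "unhappy"))}
    c1 = frozenset({lit("happy", "bob")})
    c2 = frozenset({lit("unhappy", "bob")})
    # Two positive sort literals on exclusive sorts resolve to the empty clause.
    print(exclusivity_resolvents(c1, c2, exclusive))
```

Note that the exclusivity rule resolves two positive literals, which is how the implicit negation carried by the sort-hierarchy enters a refutation.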



4.2 Hybrid Inference System with Clauses and Structured Sort Constraints

We define a hybrid inference system obtained by combining a clausal inference 
system with a sort constraint system. The inference rules in the hybrid system 
are applied to subsort declarations and clauses including sort predicates, so that 
they can deal with sort-hierarchy information in an assertional knowledge base. 

Definition 16 (Hybrid inference system). A hybrid inference system is a
system obtained by adding the axioms and rules of a constraint system to a
clausal inference system. We write $X + Y$ for the hybrid inference system obtained
from a clausal inference system $X$ and a constraint system $Y$.

The hybrid inference system $X + Y$ can be regarded as an extension of the clausal
inference system $X$. We write $(D, \Delta) \vdash_{X+Y} F$ to denote $D \cup \Delta \vdash_{X+Y} F$.

Lemma 1. The axioms of the structured sort constraint system CS are
$\Sigma^+$-valid.



Lemma 2. Let $F, F_1, \ldots, F_n$ be subsort declarations. The conclusion $F$ of each
rule in the structured sort constraint system CS is a consequence of its premise
$\{F_1, \ldots, F_n\}$ in the class of $\Sigma^+$-structures. That is,
$\{F_1, \ldots, F_n\} \models_{\Sigma^+} F$.

Proof. Elimination rule: Suppose that $I^+(s) \cup I^+(s') = I^+(s) \cup I^+(s'')$,
$I^+(s) \cap I^+(s') = \emptyset$, and $I^+(s) \cap I^+(s'') = \emptyset$. Let $d \in I^+(s')$.
Since $I^+(s') \subseteq I^+(s) \cup I^+(s') \subseteq I^+(s) \cup I^+(s'')$, we have
$d \in I^+(s) \cup I^+(s'')$. $d \in I^+(s')$ and $I^+(s) \cap I^+(s') = \emptyset$ imply
$d \notin I^+(s)$. Therefore $d \in I^+(s'')$. Similarly, the other rules can
be proved. ∎

Lemma 3. Let $F, F_1, \ldots, F_n$ be clauses or subsort declarations. The conclusion
$F$ of each rule in the resolution system RS is a consequence of its premise
$\{F_1, \ldots, F_n\}$ in the class of $\Sigma^+$-structures. That is,
$\{F_1, \ldots, F_n\} \models_{\Sigma^+} F$.

Proof. For each rule we show $\{F_1, \ldots, F_n\} \models_{\Sigma^+} F$.
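1. Exclusivity rule: Suppose that $\mathcal{I}^+ \models s(t) \lor C$, $\mathcal{I}^+ \models s'(t') \lor C'$,
and $I^+(s) \cap I^+(s') = \emptyset$. Let $\theta$ be a structured sort substitution such that
$\theta(t) = \theta(t')$. So $\mathcal{I}^+ \models (s(t) \lor C)\theta$ and
$\mathcal{I}^+ \models (s'(t') \lor C')\theta$. By $I^+(s) \cap I^+(s') = \emptyset$, either
$\mathcal{I}^+ \models s(t)\theta$ or $\mathcal{I}^+ \models s'(t')\theta$ does not hold. By the
hypothesis, $\mathcal{I}^+ \models C\theta$ or $\mathcal{I}^+ \models C'\theta$. Therefore
$\mathcal{I}^+ \models C\theta \lor C'\theta$.

2. Totality rule: Assume that $\mathcal{I}^+ \models s_1(t) \lor C$, $\mathcal{I}^+ \models \neg s(t') \lor C'$,
and $\mathcal{I}^+ \models s \mid_{s_1} s'$, i.e. $I^+(s) \cup I^+(s') = I^+(s_1)$. Let $\theta$ be a
structured sort substitution such that $\theta(t) = \theta(t')$. If
$I^+(s) \cup I^+(s') = I^+(s_1)$, then $\mathcal{I}^+ \models s(t) \lor s'(t') \lor C$. Then, we
can obtain $\mathcal{I}^+ \models s'(t)\theta \lor C\theta \lor C'\theta$. Therefore the conclusion
is a consequence of its premise. ∎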

The next theorem shows the soundness of the structured sort constraint 
system CS and the resolution system RS. 

Theorem 1. Let $H = (S^+, D)$ be a sort-hierarchy declaration, $\Delta$ a set of
clauses, and $X$ a system. If $(D, \Delta) \vdash_X F$, then $(D, \Delta) \models_{\Sigma^+} F$.

Proof. By Lemmas 1, 2, and 3, this is proved. ∎

We give the notion of contradiction in an exclusivity relation from the sort- 
hierarchy. This notion is defined by deciding whether there is a contradiction 
between an opposite sort and its antonymous sort. 

Definition 17. Let $H = (S^+, D)$ be a sort-hierarchy declaration, $\Delta$ a set of
clauses, and $X$ a system. $(D, \Delta)$ is said to be contradictory on an exclusivity
relation if there exist sorts $s, s'$ such that $(D, \Delta) \vdash_X s \parallel s'$,
$(D, \Delta) \vdash_X s(t)$, and $(D, \Delta) \vdash_X s'(t)$. $(D, \Delta)$ is said to be logically
contradictory if $(D, \Delta) \vdash_X A$ and $(D, \Delta) \vdash_X \neg A$.

The contradiction between $A$ and $\neg A$ (corresponding to “logically contradictory”
in the above definition) is defined in the usual manner of logics. We say that
$(D, \Delta)$ is consistent if $(D, \Delta)$ is neither contradictory on an exclusivity
relation nor logically contradictory.

Theorem 2. Let $H = (S^+, D)$ be a sort-hierarchy declaration and $\Delta$ a set of
clauses. If $(D, \Delta)$ has a $\Sigma^+$-model, then $(D, \Delta)$ is consistent.

Proof. Suppose that $\mathcal{I}^+$ is a $\Sigma^+$-model of $(D, \Delta)$. If $(D, \Delta)$ is
contradictory on an exclusivity relation, then there exist $s, s'$ such that
$(D, \Delta) \vdash_X s \parallel s'$, $(D, \Delta) \vdash_X s(t)$, and $(D, \Delta) \vdash_X s'(t)$. By
Theorem 1, $\mathcal{I}^+ \models s \parallel s'$, $\mathcal{I}^+ \models s(t)$, and
$\mathcal{I}^+ \models s'(t)$. Then $I^+(s) \cap I^+(s') = \emptyset$ but
$[\![t]\!]_\alpha \in I^+(s)$ and $[\![t]\!]_\alpha \in I^+(s')$. If $(D, \Delta)$ is logically
contradictory, then $\mathcal{I}^+ \models \neg A$ and $\mathcal{I}^+ \models A$.
Hence, both cases contradict the hypothesis. Therefore $(D, \Delta)$ is
consistent. ∎

A refutation is a derivation of the empty clause (denoted □) from $(D, \Delta)$,
written as $(D, \Delta) \vdash_X \Box$. The next corollary guarantees that the hybrid
inference system CS + RS is sound.

Corollary 1. Let $H = (S^+, D)$ be a sort-hierarchy declaration and $\Delta$ a set of
clauses. If $(D, \Delta) \vdash_{CS+RS} \Box$, then $(D, \Delta) \models_{\Sigma^+} \Box$.
Proof. When the empty clause □ is derived, the final rule applied in the refutation
must be one of the rules in the resolution system RS. We consider each case as
follows:

1. Cut rule: There exists a structured sort substitution $\theta$ such that
$L\theta = L'\theta$, and $(D, \Delta) \vdash_{CS+RS} \neg L$ and $(D, \Delta) \vdash_{CS+RS} L'$. So, by
Theorem 1, we have $(D, \Delta) \models_{\Sigma^+} \neg L$ and $(D, \Delta) \models_{\Sigma^+} L'$. Now
assume that $\mathcal{I}^+$ is a $\Sigma^+$-model of $(D, \Delta)$. Then
$\mathcal{I}^+ \models \neg L\theta$ and $\mathcal{I}^+ \models L'\theta$ ($= L\theta$), which
contradicts our assumption. Since $(D, \Delta)$ has no $\Sigma^+$-model,
$(D, \Delta) \models_{\Sigma^+} \Box$ is proved.

2. Resolution rules: Similar to 1. ∎

5 Conclusions 

This paper has presented an order-sorted logic that can deal with implicit nega- 
tions in a sort hierarchy. We have presented a hybrid inference system that 
consists of a clausal inference system and a structured sort constraint system. 
This system includes structured sort expressions composed of atomic sorts, con- 
nectives, and negative operators, in order to deal with implicitly negative sorts 
embedded in a sort-hierarchy. To represent these negative sorts, we have pro- 
posed the notions of sort relations (subsort relation, equivalence relation, exclu- 
sivity relation, and totality relation) on the structured sorts, and we have ax- 
iomatized the properties of implicitly negative sorts. Thus, the structured sort 
constraint system can derive relationships between classical negation, strong 
negation, and antonyms in a sort-hierarchy. Furthermore, the contradiction in 
the sort-hierarchy as defined by the exclusivity relation enables us to prove the 
soundness of our logic with structured sorts. 

We need to improve our hybrid inference system in order to tackle the
implementation issues caused by the complicated sort expressions. As remaining
theoretical work, a complete system must be given by revising the axioms
and rules.
Abstract. Recent development of logic programming languages based
on linear logic suggests a successful direction to extend logic program- 
ming to be more expressive and more efficient. The treatment of formulas- 
as-resources gives us not only powerful expressiveness, but also efficient 
access to a large set of data. However, in linear logic, whole resources 
are kept in one context, and there is no straight way to represent com- 
plex data structures as resources. For example, in order to represent an 
ordered list and time-dependent data, we need to put additional indices 
for each resource formula. This paper describes a logic programming lan- 
guage, called TLLP, based on intuitionistic temporal linear logic. This 
logic, an extension of linear logic with some features from temporal
logics, allows the use of the modal operators ‘○’ (next-time) and ‘□’ (always)
in addition to the operators used in intuitionistic linear logic. The
intuitive meaning of modal operators is as follows: ○B means that B can be
used exactly once at the next moment in time; □B means that B can
be used exactly once any time; !B means that B can be used arbitrarily
many times (including 0 times) at any time. We first give a proof-theoretic
formulation of the logic of the TLLP language. We then present a
series of resource management systems designed to implement not only 
interpreters but also compilers based on an extension of the standard 
WAM model. 



1 Introduction 

Linear logic was introduced by J.-Y. Girard in 1987 [4] as a resource-conscious 
refinement of classical logic. Since then a number of logic programming languages 
based on linear logic have been proposed: LO [1], ACL [12], Lolli [3][8][9], Lygon [5],
Forum [13], and LLP [2][15].

These languages suggest a direction to extend logic programming to be more 
expressive and more efficient. The treatment of formulas-as-resources gives us 
not only powerful expressiveness, but also efficient access to a large set of data. 
However, in linear logic, whole resources are kept in one context, and there is 
no straight way to represent complex data structures as resources. For example, 
in order to represent an ordered list and time-dependent data, we need to put 
additional indices for each resource formula. 

Temporal Linear Logic (TLL) is an extension of linear logic with some
features of temporal logic. TLL was first studied by Kanovich and Itoh [11], and
a cut-free sequent system has been proposed by Hirai [6]. The semantic model
of TLL consists of an infinite number of phase spaces linearly ordered by the time
clock. Each phase space is the same as that of linear logic.

This paper describes a logic programming language, called TLLP, based on 
intuitionistic temporal linear logic [6] . We first give a proof theoretic formulation 
of the logic of the TLLP language. We then present a series of resource man- 
agement systems designed to implement not only interpreters but also compilers 
based on an extension of the standard WAM model. Finally, we describe some 
implementation methods based on our systems. 

2 Intuitionistic Temporal Linear Logic 

In this paper, we will focus on the sequent system ITLL [6] of intuitionistic 
temporal linear logic developed by Hirai. The expressive power of ITLL is shown 
by a natural encoding of Timed Petri Nets. It is this logic that we shall use to 
design and implement the logic programming language described below. 

ITLL allows the use of the modal operators ‘○’ (next-time) and ‘□’ (always)
in addition to the operators used in intuitionistic linear logic. Compared with the
sequent system ILL (see Fig. 1) of intuitionistic linear logic, three rules (L□),
(R□), and (○) are added. The entire set of ITLL sequent rules is given in Fig. 2.
Here, the left-hand sides of sequents are multisets of formulas, and the structural
rule for exchange need not be explicitly stated. The structural rules for weakening
(W!) and contraction (C!) are available only for assumptions marked with the
modal operator ‘!’. This means that, in general, formulas not !-marked can be
used exactly once. Limited-use formulas can represent time-dependent resources
in ITLL. The intuitive meaning of these modal operators is as follows:

- ○B means that B can be used exactly once at the next moment in time.

- □B means that B can be used exactly once any time.

- !B means that B can be used arbitrarily many times (including 0 times) at
  any time.

By combining these modalities with binary operators in linear logic, several
resources can be expressed. For example, B & ○B means that B can be used
exactly once either at the present time or at the next moment in time. ○(1 & B)
means that B can be used at most once at the next moment in time.
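(Rules of ILL)

$$\frac{\Gamma, B \longrightarrow C}{\Gamma, \Box B \longrightarrow C}\ (L\Box) \qquad \frac{!\Gamma, \Box\Sigma \longrightarrow C}{!\Gamma, \Box\Sigma \longrightarrow \Box C}\ (R\Box)$$

$$\frac{!\Gamma, \Box\Sigma, \Delta \longrightarrow C}{!\Gamma, \Box\Sigma, \bigcirc\Delta \longrightarrow \bigcirc C}\ (\bigcirc)$$

Fig. 2. The proof system ITLL for intuitionistic temporal linear logic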



Two formulas B and C are equivalent, denoted B ≡ C, if the sequents B ⟶ C
and C ⟶ B are provable in ITLL. The notation ○^n means n-fold multiplicity of
○. We note the following sequents that are provable in ITLL.

!B ≡ !!B, □B ≡ □□B, !B ≡ □!B,

!B ⟶ □B ⊗ ··· ⊗ □B, □B ⟶ ○^n B (n ≥ 0)

The main differences from other temporal linear logic systems [11][16] are
that ITLL includes the modal operator ‘!’ and it satisfies a cut elimination
theorem. Both of these additions are very important for the design of a language
based on the notion of Uniform Proofs.

3 Language Design 

The idea of uniform proofs [14], proposed by Miller et al., is a simple and powerful
notion for designing logic programming languages. Uniform proof search is a
cut-free, goal-directed proof search in which a sequent P ⟶ G denotes the state of
the computation trying to solve the goal G from the program P. Goal-directed
proof search is characterized operationally by the bottom-up construction of
proofs in which right-introduction rules are applied first and left-introduction
rules are applied only when the right-hand side is atomic. This means that the
operators in the goal G are executed independently from the program P, and
the program is only considered when its goal is atomic. A logical system is an
abstract logic programming language if restricting it to uniform proofs retains
completeness. The logics of Prolog, λProlog, and Lolli are examples of abstract
logic programming languages.

Clearly, intuitionistic linear logic (even over the connectives ⊤, &, ⊗, ⊸, !,
and ∀) is not an abstract logic programming language. For example, the sequents
a ⊗ b ⟶ b ⊗ a and !a & b ⟶ !a are both provable in ILL but do not have uniform
proofs.

Hodas and Miller have designed the linear logic programming language Lolli
[7][8] by restricting formulas so that the above counterexamples do not appear,
although it retains desirable features of linear logic connectives such as ! and ⊗.
The Lolli language is based on the following fragment of linear logic:

R ::= ⊤ | A | R1 & R2 | G ⊸ R | G ⇒ R | ∀x.R

G ::= 1 | ⊤ | A | G1 ⊗ G2 | G1 & G2 | G1 ⊕ G2 | R ⊸ G | R ⇒ G | !G | ∀x.G | ∃x.G

Here, R-formulas and G-formulas are called resource and goal formulas
respectively. The connective ⇒ is called intuitionistic implication, and it is
defined as B ⇒ C ≡ (!B) ⊸ C.

The sequent of Lolli is of the form Γ; Δ ⟶ G where Γ is a set of resource
formulas, Δ is a multiset of resource formulas, and G is a goal formula. Γ and
Δ are called intuitionistic and linear context respectively, and they correspond
to the program. G is called the goal. The sequent Γ; Δ ⟶ G can be mapped
to the linear logic sequent !Γ, Δ ⟶ G. Thus, the right introduction rule for
⊸ adds its assumption (called a linear resource) to the linear context, in which
every formula can be used exactly once. The right introduction rule for ⇒ adds
its assumption (called an intuitionistic resource) to the intuitionistic context, in
which every formula can be used arbitrarily many times (including 0 times).

Hodas and Miller developed a series of proof systems 𝓛 (see Fig. 3) and 𝓛′
in [7]. They proved that 𝓛 is sound and complete with respect to the ILL rules
restricted to the Lolli language. They also proved that 𝓛 preserves completeness
even if provability is restricted to uniform proofs. 𝓛′ is the proof system that
results from replacing the Identity, L⊸, L⇒, L&, and L∀ rules in 𝓛 with a single
rule, called backchaining.
In this paper, we will use a more restrictive definition for resource and goal
formulas. Let A be atomic and m ≥ 1:

R ::= S1 & ··· & Sm

S ::= ⊤ | A | G ⊸ A | ∀x.S

G ::= 1 | ⊤ | A | G1 ⊗ G2 | G1 & G2 | G1 ⊕ G2 | R ⊸ G | S ⇒ G | !G | ∀x.G | ∃x.G

Here, S-formulas are called resource clauses in which A and G are called the head
and the body respectively. S-formulas correspond to program clauses. Although
this simplification does not change the expressiveness of the language, it makes the
presentation of backchaining simpler, as is discussed below.

Since full intuitionistic linear logic is not an abstract logic programming
language, it is obvious that intuitionistic temporal linear logic is not one either.
For example, in addition to the counterexamples in ILL, the sequents □○a ⟶ ○a,
!○a ⟶ ○a, and a & ○a ⟶ ○a are all provable in ITLL, but they do not
have uniform proofs.

Fig. 4 presents a proof system 𝒯𝓛 for the connectives ⊤, &, ⊸, ⇒, ∀, 1, !, ⊗,
⊕, ∃, ○, and □. Two rules, L□ and ○, are added in addition to those that arise
in 𝓛. This system has been designed to support the logic programming language
TLLP over the following formulas: If A is atomic and m ≥ 1,

R ::= S1 & ··· & Sm | □(S1 & ··· & Sm) | ○R

S ::= ⊤ | A | G ⊸ A | ∀x.S

G ::= 1 | ⊤ | A | G1 ⊗ G2 | G1 & G2 | G1 ⊕ G2 | R ⊸ G | S ⇒ G | !G | ∀x.G | ∃x.G | ○G
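(Rules of 𝓛)

$$\frac{\Gamma; \Delta, B \longrightarrow C}{\Gamma; \Delta, \Box B \longrightarrow C}\ (L\Box) \qquad \frac{\Gamma; \Box\Sigma, \Delta \longrightarrow C}{\Gamma; \Box\Sigma, \bigcirc\Delta \longrightarrow \bigcirc C}\ (\bigcirc)$$

Fig. 4. 𝒯𝓛: a proof system for the connectives ⊤, &, ⊸, ⇒, ∀, 1, !, ⊗, ⊕, ∃, ○, and □.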

Let D be a &-product of resource clauses S1 & ··· & Sm. Compared with Lolli,
○^n D and ○^n □D are added to resource formulas, and ○G is added to goal
formulas. The intuitive meaning of these formulas is as follows: ○^n D means
that the resource clause Si (1 ≤ i ≤ m) in D can be used exactly once at time n;
○^n □D means that the resource clause Si (1 ≤ i ≤ m) in D can be used exactly
once any time at and after time n; ○G adjusts time one clock ahead and then
executes G.

The proofs of propositions in this paper are based on Hodas and Miller’s 
results in [7] for the Lolli language, and we will only give proof outlines. 

Proposition 1. Let G be a goal formula, Γ a set of resource clauses, and Δ a
multiset of resource formulas. Let D* be the result of replacing all occurrences
of B ⇒ G in D with (!B) ⊸ G, and let Γ* = {B* | B ∈ Γ}. Then the sequent
Γ; Δ ⟶ G is provable in 𝒯𝓛 if and only if !(Γ*), Δ* ⟶ G* is provable in
ITLL.

Proof (sketch). The proof of this proposition can be shown by giving a simple
conversion between proofs in the two systems. The cases of ○ and L□ are also
immediate. □



Proposition 2. Let G be a goal formula, Γ a set of resource clauses, and Δ a
multiset of resource formulas. Then the sequent Γ; Δ ⟶ G has a proof in 𝒯𝓛
if and only if it has a uniform proof in 𝒯𝓛.



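Proof (sketch). The proof in the reverse direction is immediate, since a uniform
proof in 𝒯𝓛 is a proof in 𝒯𝓛. The forward direction can be proved by showing
that any proof in 𝒯𝓛 can be converted to a uniform proof of the same endsequent
by permuting the rules to move occurrences of the left-rules up through and above
instances of the right-rules. We explicitly show one case, that is when L□ occurs
below R&:

$$\cfrac{\cfrac{\begin{matrix}\Xi_1\\ \Gamma; \Delta, B \longrightarrow G_1\end{matrix} \qquad \begin{matrix}\Xi_2\\ \Gamma; \Delta, B \longrightarrow G_2\end{matrix}}{\Gamma; \Delta, B \longrightarrow G_1 \,\&\, G_2}\ (R\&)}{\Gamma; \Delta, \Box B \longrightarrow G_1 \,\&\, G_2}\ (L\Box)$$

where $\Xi_1$ and $\Xi_2$ are uniform proofs of their endsequents respectively. The
above proof structure can be converted to the following:

$$\cfrac{\cfrac{\begin{matrix}\Xi_1\\ \Gamma; \Delta, B \longrightarrow G_1\end{matrix}}{\Gamma; \Delta, \Box B \longrightarrow G_1}\ (L\Box) \qquad \cfrac{\begin{matrix}\Xi_2\\ \Gamma; \Delta, B \longrightarrow G_2\end{matrix}}{\Gamma; \Delta, \Box B \longrightarrow G_2}\ (L\Box)}{\Gamma; \Delta, \Box B \longrightarrow G_1 \,\&\, G_2}\ (R\&)$$

□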

As with 𝓛 and 𝓛′, the left-hand rules can be restricted to a form of backchaining.
Let us consider the following definition: Let R be a resource formula. ||R||
is defined as a set of resource clauses (S-formulas):

1. if R = A then ||R|| = {A},

2. if R = G ⊸ A then ||R|| = {G ⊸ A},

3. if R = ∀x.S then for all closed terms t, ||R|| = ||S[t/x]||,

4. if R = S1 & ··· & Sm then ||R|| = ||S1|| ∪ ··· ∪ ||Sm||,

5. if R = □R′ then ||R|| = ||R′||,

6. if R = ○R′ then ||R|| = ∅.

Let 𝒯𝓛′ be the proof system that results from replacing the Identity, absorb,
L⊸, L⇒, L&, L∀, and L□ rules in 𝒯𝓛 with the backchaining rules in Fig. 5.
These backchaining rules (especially the definition of || · ||) are simpler than the
original rule for Lolli because of the restrictive definition of resource formulas.
Note that the absorb rule is integrated into (BC!1) and (BC!2).
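
The definition of || · || can be read as a small recursive function. The sketch below covers only the propositional cases 1, 2, 4, 5, and 6 (the quantifier case is omitted), and the tuple encoding of formulas and the function name `clauses` are assumptions made for illustration.

```python
# Assumed encoding of propositional resource formulas as nested tuples:
#   ("atom", name)            an atomic formula A
#   ("lolli", goal, name)     a clause G -o A (goal is any hashable value)
#   ("and", (S1, ..., Sm))    a &-product S1 & ... & Sm
#   ("box", R)                the formula []R
#   ("next", R)               the formula (next) R
def clauses(r):
    """Return ||R||, the set of resource clauses usable for backchaining now."""
    kind = r[0]
    if kind in ("atom", "lolli"):
        return {r}                      # cases 1 and 2
    if kind == "and":
        result = set()
        for s in r[1]:
            result |= clauses(s)        # case 4: union over the conjuncts
        return result
    if kind == "box":
        return clauses(r[1])            # case 5: []R' exposes the clauses of R'
    if kind == "next":
        return set()                    # case 6: nothing is usable before the next tick
    raise ValueError(f"unsupported resource formula: {r!r}")

if __name__ == "__main__":
    r = ("box", ("and", (("atom", "a"), ("lolli", ("atom", "g"), "b"))))
    print(clauses(r))
    print(clauses(("next", ("atom", "a"))))
```

Case 6 is what keeps a resource under ○ out of reach of backchaining until a ○ goal has advanced the clock.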

Proposition 3. Let G be a goal formula, Γ a set of resource clauses, and Δ a
multiset of resource formulas. Then the sequent Γ; Δ ⟶ G has a proof in 𝒯𝓛
if and only if it has a proof in 𝒯𝓛′.

Since uniform proofs are complete for 𝒯𝓛, this proposition can be proved by
showing that there is a uniform proof in 𝒯𝓛 if and only if there is a proof in
𝒯𝓛′. We do not present the proof here. A similar proof has been given by Hodas
and Miller in [7] for the Lolli language.



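$$\frac{}{\Gamma; R \longrightarrow A}\ (BC_1) \qquad \frac{}{\Gamma, R; \emptyset \longrightarrow A}\ (BC!_1)$$

provided, in each case, A is atomic and A ∈ ||R||.

$$\frac{\Gamma; \Delta \longrightarrow G}{\Gamma; \Delta, R \longrightarrow A}\ (BC_2) \qquad \frac{\Gamma, R; \Delta \longrightarrow G}{\Gamma, R; \Delta \longrightarrow A}\ (BC!_2)$$

provided, in each case, A is atomic and G ⊸ A ∈ ||R||.

Fig. 5. Backchaining for the proof system 𝒯𝓛′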



3.1 TLLP Example Programs 

We will present simple TLLP examples here. For the syntax, we use ‘:-’ for the
inverse of ⊸, ‘,’ for ⊗, ‘-<>’ for ⊸, ‘=>’ for ⇒, ‘@’ for ○, and ‘#’ for □.

We first consider a Lolli program that finds a Hamilton path through the
complete graph of four vertices. Since each vertex is represented as a linear
resource, the constraint that each vertex must be used exactly once can be
expressed.
p(V,V,[V]) :- v(V).

p(U,V,[U|P]) :- v(U), e(U,W), p(W,V,P).

e(U,V).

goal(P) :- v(a) -<> v(b) -<> v(c) -<> v(d) -<> p(a,d,P).

When the goal goal(P) is executed, the vertices are added as resources, and the 
goal p(a,d,P) will search a path from a to d by consuming each vertex exactly 
once. 

In addition to the resource-sensitive features of Lolli, TLLP can describe the
time-dependent properties of resources, in particular, the precise order of the
moments when some resources are consumed. For example, #v(a) denotes the
vertex a that can be used exactly once at and after the present time. @ #v(c)
denotes the vertex c that can be used exactly once at and after the next moment in
time. So, the following TLLP program finds a Hamilton path that satisfies such
constraints. Note that time is adjusted one clock ahead every time the
path crosses an arc.

p(V,V,[V]) :- v(V).

p(U,V,[U|P]) :- v(U), e(U,W), @p(W,V,P).

e(U,V).

goal(P) :- #v(a) -<> @ @v(b) -<> @ #v(c) -<> #v(d) -<> p(a,d,P).
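
The effect of the temporal annotations in this goal can be checked by brute force outside TLLP. The following sketch is not the TLLP proof search: it simply enumerates orderings of the four vertices and keeps those in which every vertex is consumed at an admissible time, with time advancing by one per arc. The table `AVAILABLE` and the function name are hypothetical.

```python
from itertools import permutations

# Admissible consumption times, read off the goal above:
#   #v(a)    usable once at any time t >= 0
#   @ @v(b)  usable once exactly at time 2
#   @ #v(c)  usable once at any time t >= 1
#   #v(d)    usable once at any time t >= 0
AVAILABLE = {
    "a": lambda t: t >= 0,
    "b": lambda t: t == 2,
    "c": lambda t: t >= 1,
    "d": lambda t: t >= 0,
}

def timed_hamilton_paths(start, end):
    """Enumerate Hamilton paths in the complete graph whose i-th vertex
    (counting from 0) may be consumed at time i."""
    paths = []
    for order in permutations(sorted(AVAILABLE)):
        if order[0] != start or order[-1] != end:
            continue
        if all(AVAILABLE[v](t) for t, v in enumerate(order)):
            paths.append(list(order))
    return paths

if __name__ == "__main__":
    print(timed_hamilton_paths("a", "d"))
```

Under these constraints the only admissible path from a to d is a, c, b, d, since b must be visited at exactly time 2.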

Our next example is a simple Timed Petri Nets reachability emulator. This 
program checks the reachability of a Timed Petri Net in Fig. 6 from the initial 
marking (one token in p) to the final marking (one token in p and two tokens in 
r). Each di, a non-negative integer, is the delay time for the transition ti. 

tpn :- #p -<> (goal :- p, r, r) => tpn(1, 100).

tpn(Dep, Lim) :- Dep =< Lim, fire(Dep).

tpn(Dep, Lim) :- Dep =< Lim, Dep1 is Dep + 1, tpn(Dep1, Lim).

next(D) :- D1 is D - 1, D1 > 0, fire(D1).

fire(D) :- goal.

fire(D) :- p, @ #p -<> @ #q -<> next(D).

fire(D) :- q, q, q, @ #r -<> next(D).

fire(D) :- @next(D).

Since the proof search of TLLP is depth-first and is not complete, we use an
iterative deepening search, a combination of depth-first and breadth-first search.
First, the predicate tpn(Dep, Lim) checks the reachability at depth 1, and then
it increases the depth by one if the check fails.
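
The control structure of tpn(Dep, Lim) is ordinary iterative deepening. As a hedged illustration on a plain directed graph rather than a Timed Petri Net, the following sketch retries a depth-limited depth-first search with an increasing bound; the graph and both function names are assumptions.

```python
def depth_limited(graph, node, target, depth):
    """Depth-first search that gives up below the given depth bound."""
    if node == target:
        return [node]
    if depth == 0:
        return None
    for nxt in graph.get(node, ()):
        path = depth_limited(graph, nxt, target, depth - 1)
        if path is not None:
            return [node] + path
    return None

def iterative_deepening(graph, start, target, limit):
    """Try bounds 1, 2, ..., limit, mirroring tpn(Dep, Lim)."""
    for depth in range(1, limit + 1):
        path = depth_limited(graph, start, target, depth)
        if path is not None:
            return depth, path
    return None

if __name__ == "__main__":
    # The self-loop on "a" would trap an unbounded depth-first search.
    graph = {"a": ["a", "b"], "b": ["c"]}
    print(iterative_deepening(graph, "a", "c", 100))
```

The bound plays the role of Dep: each failed round is discarded and the search restarts one level deeper, so a shallowest solution is found first at the cost of repeated work.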

Besides the examples presented above, the latest TLLP package includes 
programs for the flow-shop scheduling problem and Conway’s Game of Life. 
Fig. 6. An example of Timed Petri Nets



4 Resource Management Model 

The resource management during a proof search in 𝒯𝓛′ is a serious problem for
the implementor. Let us consider, for example, the execution of the goal G1 ⊗ G2:

$$\frac{\Gamma; \Delta_1 \longrightarrow G_1 \qquad \Gamma; \Delta_2 \longrightarrow G_2}{\Gamma; \Delta_1, \Delta_2 \longrightarrow G_1 \otimes G_2}\ (\otimes)$$

When the system applies this rule during bottom-up search, the linear context
Δ must be divided into Δ1 and Δ2. If Δ contains n resource formulas, all 2^n
possibilities might need to be tested to find a desirable partition.

For Lolli, Hodas and Miller solved this problem by splitting resources lazily, 
and they proposed a new execution model called the I/O model [8]. 

In this model, the sequent I {G} O means that the goal G can be executed
given the input context I so that the output context O remains. The input and
output contexts, together called IO-contexts, are lists of resource formulas,
!-marked resource formulas, or the special symbol 1 that denotes a place where a
resource formula has been consumed. In the execution of the goal G1 ⊗ G2:

$$\frac{I\,\{G_1\}\,M \qquad M\,\{G_2\}\,O}{I\,\{G_1 \otimes G_2\}\,O}\ (\otimes)$$

First, I {G1} M tries to execute G1 given the input context I. If this succeeds,
the output context M is forwarded to G2, and then M {G2} O is attempted.
If this second attempt fails, I {G1} M retries to find a different, more desirable
consumption pattern.

We will extend the I/O model for the TLLP language. The additional problem
here is that the bottom-up application of the rule for ○ in 𝒯𝓛′ requires
manipulating large dynamic data structures.

$$\frac{\Gamma; \Box\Sigma, \Delta \longrightarrow G}{\Gamma; \Box\Sigma, \bigcirc\Delta \longrightarrow \bigcirc G}\ (\bigcirc)$$

For example, when the system executes the goal ○G given the input context I =
[p, ○q, ○○r, !s], we need to reconstruct and create a new input context I′ =
[1, q, ○r, !s] before the execution of the goal G.

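We introduce a time index to solve this problem. Fig. 7 presents an extension
of the I/O model for the TLLP language, called IOT. IOT makes use of a time
index T. The sequent is of the form $I\,\{G\}_T\,O$. T, a non-negative integer, is the
current time. At a given point in the proof, only resources that can be used at
that time may be used. T is also used to set the consumption time of newly added
resources.

$$\frac{}{I\,\{1\}_T\,I}\ (1) \qquad \frac{subcontext_T(O, I)}{I\,\{\top\}_T\,O}\ (\top)$$

$$\frac{I\,\{G_1\}_T\,M \qquad M\,\{G_2\}_T\,O}{I\,\{G_1 \otimes G_2\}_T\,O}\ (\otimes) \qquad \frac{I\,\{G_1\}_T\,O \qquad I\,\{G_2\}_T\,O}{I\,\{G_1 \,\&\, G_2\}_T\,O}\ (\&)$$

$$\frac{I\,\{G_i\}_T\,O}{I\,\{G_1 \oplus G_2\}_T\,O}\ (\oplus_i) \qquad \frac{[(!S, 0) \mid I]\,\{G\}_T\,[(!S, 0) \mid O]}{I\,\{S \Rightarrow G\}_T\,O}\ (\Rightarrow)$$

$$\frac{[(R, T + n) \mid I]\,\{G\}_T\,[1 \mid O]}{I\,\{\bigcirc^n R \multimap G\}_T\,O}\ (\multimap)$$

provided that R is a formula of the form $S_1 \,\&\, \cdots \,\&\, S_m$ or $\Box(S_1 \,\&\, \cdots \,\&\, S_m)$.

$$\frac{I\,\{G\}_T\,I}{I\,\{!G\}_T\,I}\ (!) \qquad \frac{I\,\{G\}_{T+1}\,O}{I\,\{\bigcirc G\}_T\,O}\ (\bigcirc)$$

$$\frac{pick_T(I, O, A)}{I\,\{A\}_T\,O}\ (BC_1) \qquad \frac{pick_T(I, M, G \multimap A) \qquad M\,\{G\}_T\,O}{I\,\{A\}_T\,O}\ (BC_2)$$

Fig. 7. IOT: An I/O model for propositional TLLP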

Each element in an IOT-context is a pair (R, t) where R is a resource formula or
!-marked resource formula, and t is its consumption time, or the special symbol 1.
Linear resources have the form (S1 & ... & Sm, t) or (□(S1 & ... & Sm), t), where
t is its consumption time calculated from the value of T and its multiplicity of
○. Intuitionistic resources have the form (!S, 0), where S is a resource clause.
For example, the consumable resources at time T have the following forms in
the context: (S1 & ... & Sm, T), (□(S1 & ... & Sm), t) where t ≤ T, and (!S, 0).

The relation pick_T(I, O, S) holds if S occurs in the context I and is consumable
at time T, and O results from replacing that occurrence of S in I with 1.
The relation also holds if !S occurs in I, and I and O are equal. The relation
subcontext_T(O, I) holds if O arises from replacing arbitrarily many (including 0)
non-!-marked elements of I that are consumable any time at and after time T
with 1.

To prove that IOT is logically equivalent to 𝒯𝓛, we need to define the
notion of difference I −_T O for two IOT-contexts I and O that satisfy the relation
subcontext_T(O, I). I −_T O is a pair (Γ, Δ), where Γ is the set of all formulas S
such that (!S, 0) is an element of I (and O), and Δ is the multiset of all formulas
○^max(0, t−T) R such that (R, t) occurs in I (if R is of the form S1 & ··· & Sm,
then t ≥ T; if R is of the form □(S1 & ··· & Sm), then t is arbitrary) and the
corresponding place in O is the symbol 1.
Proposition 4. Let T be a non-negative integer. Let I and O be IOT-contexts
that satisfy subcontext_T(O, I). Let I −_T O be the pair (Γ, Δ) and let G be a goal
formula. I {G}_T O is provable in IOT if and only if Γ; Δ ⟶ G is provable in
𝒯𝓛.

Proof (sketch). This proposition, in both directions, can be proved by induction
on proof structure. □



5 Level-Based Resource Management Model 

The I/O model provides an efficient computation model for proof search. The
I/O model has been refined several times. Cervesato et al. recently proposed
a refinement designed to eliminate the non-determinism in the management of the
linear context involving & and ⊤ [3]. However, the I/O model and its refinements
still require copying and scanning large dynamic data structures to control the
consumption of linear resources. Thus, they are better suited to developing
interpreters in high-level languages than compilers.

We point out two problems here. First, during the execution of I {G} O
(especially pick), the context O is reconstructed from the context I by replacing
consumed linear resources with 1. This slows down the execution.
Secondly, let us consider the execution of the goal G1 & G2:

$$\frac{I\,\{G_1\}\,O \qquad I\,\{G_2\}\,O}{I\,\{G_1 \,\&\, G_2\}\,O}$$

This rule means that the goals G1 and G2 must use the same resources. In a
naive implementation, the system first copies the input context and executes the
two conjuncts separately, and then it compares their output contexts. This leads
to unnecessary backtracking.

To solve these problems, Tamura et al. have introduced a refinement of the
I/O model with level indices [10][15], called the IOL model. Hodas et al.
recently proposed a refinement of IOL for the complete treatment of ⊤ in [9].

lOL makes use of two level indices L and U to manage the consumption of 
resources. The sequent is of the form \~l,u I {G} O. L, a positive integer, is the 
current consumption level. At a given point in the proof, only linear resources 
labeled with that consumption level (and intuitionistic resources labeled with 0) 
can be used. U, a negative integer, is the current consumption maker. When a 
linear resource is consumed, its consumption level is changed to the value of U . 

Each element in /OGcontext is a pair (R,£), where i? is a resource formula, 
and i is its consumption level. Linear resources have the form {R,i), where i is 
the value of L at which the resource can be consumed. Intuitionistic resources 
have the form {S, 0) where S' is a resource clause. 

\-L,u-i I {Gi} M changeu_.^ ]^_^_i{M,N) \~l+i,u N {G 2 } O thinableL+i{0) 

^L,U /{Gi&G2}0 

In this paper, we use the notation in [10] to explain the lOL model. 



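$$\frac{}{\vdash^{T}_{L,U} I\,\{1\}\,I}\;(1) \qquad \frac{\mathit{subcontext}_{U,L}(O, I)}{\vdash^{T}_{L,U} I\,\{\top\}\,O}\;(\top)$$

$$\frac{\vdash^{T}_{L,U} I\,\{G_1\}\,M \qquad \vdash^{T}_{L,U} M\,\{G_2\}\,O}{\vdash^{T}_{L,U} I\,\{G_1 \otimes G_2\}\,O}\;(\otimes)$$

$$\frac{\vdash^{T}_{L,U-1} I\,\{G_1\}\,M \qquad \mathit{change}_{U-1,L+1}(M,N) \qquad \vdash^{T}_{L+1,U} N\,\{G_2\}\,O \qquad \mathit{thinable}_{L+1}(O)}{\vdash^{T}_{L,U} I\,\{G_1 \mathbin{\&} G_2\}\,O}\;(\&)$$

$$\frac{\vdash^{T}_{L,U} I\,\{G_i\}\,O}{\vdash^{T}_{L,U} I\,\{G_1 \oplus G_2\}\,O}\;(\oplus_i) \qquad \frac{\vdash^{T}_{L,U} [\langle S,0,0\rangle \mid I]\,\{G\}\,[\langle S,0,0\rangle \mid O]}{\vdash^{T}_{L,U} I\,\{S \Rightarrow G\}\,O}\;(\Rightarrow)$$

$$\frac{\vdash^{T}_{L,U} [\langle R, T+n, L\rangle \mid I]\,\{G\}\,[\langle R, T+n, U\rangle \mid O]}{\vdash^{T}_{L,U} I\,\{\bigcirc^{n} R \multimap G\}\,O}\;(\multimap)$$

provided that R is a formula of the form S1 & … & Sm or □(S1 & … & Sm). 

$$\frac{\vdash^{T}_{L+1,U} I\,\{G\}\,O}{\vdash^{T}_{L,U} I\,\{!G\}\,O}\;(!) \qquad \frac{\vdash^{T+1}_{L,U} I\,\{G\}\,O}{\vdash^{T}_{L,U} I\,\{\bigcirc G\}\,O}\;(\bigcirc)$$

$$\frac{\mathit{pick}^{T}_{L,U}(I, O, A)}{\vdash^{T}_{L,U} I\,\{A\}\,O}\;(BC_1) \qquad \frac{\mathit{pick}^{T}_{L,U}(I, M, G \multimap A) \qquad \vdash^{T}_{L,U} M\,\{G\}\,O}{\vdash^{T}_{L,U} I\,\{A\}\,O}\;(BC_2)$$

Fig. 8. IOTL: A level-based I/O model for propositional TLLP 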



For example, the outline of the execution of the goal G1 & G2 is as follows: 

1. $\vdash_{L,U-1} I\,\{G_1\}\,M$: Decrement U so that we know which resources are 
consumed during the execution of G1, and then execute G1. 

2. $\mathit{change}_{U-1,L+1}(M,N)$: Change the level of the resources that have been 
consumed in G1 to L + 1. 

3. $\vdash_{L+1,U} N\,\{G_2\}\,O$: Increment L and U, and then execute G2. 

4. $\mathit{thinable}_{L+1}(O)$: Check that no resource in O has L + 1 as its 
consumption level. 
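
The four steps above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions that are not part of the paper: goals are limited to atoms, 1, ⊗ and &, resources are atomic facts, clause backchaining is omitted, and contexts are immutable tuples of (name, level) pairs instead of a destructively updated resource table. All function names are hypothetical.

```python
def pick(ctx, L, U, atom):
    """Yield every context obtained by using one consumable copy of `atom`."""
    for i, (name, level) in enumerate(ctx):
        if name != atom:
            continue
        if level == 0:                 # intuitionistic resource: usable, never consumed
            yield ctx
        elif level == L:               # linear resource at the current level: mark with U
            yield ctx[:i] + ((name, U),) + ctx[i + 1:]

def change(ctx, old, new):
    """Relabel every resource whose level is `old` with `new`."""
    return tuple((n, new if l == old else l) for n, l in ctx)

def thinable(ctx, level):
    """True when no resource in ctx still carries `level`."""
    return all(l != level for _, l in ctx)

def solve(goal, ctx, L, U):
    """Yield all output contexts for `goal` (assumed goal encoding: tagged tuples)."""
    kind = goal[0]
    if kind == "atom":
        yield from pick(ctx, L, U, goal[1])
    elif kind == "one":
        yield ctx
    elif kind == "tensor":
        for mid in solve(goal[1], ctx, L, U):
            yield from solve(goal[2], mid, L, U)
    elif kind == "with":
        for m in solve(goal[1], ctx, L, U - 1):       # step 1: run G1 with marker U-1
            n = change(m, U - 1, L + 1)               # step 2: what G1 used moves to L+1
            for o in solve(goal[2], n, L + 1, U):     # step 3: run G2 at level L+1
                if thinable(o, L + 1):                # step 4: G2 used all of it
                    yield o
```

With the context `(("a", 1), ("b", 1))`, level 1 and marker -1, the goal a & a succeeds and leaves b untouched, whereas a & b fails because b was not consumed by the first conjunct and is therefore not visible at level 2.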

IOL is logically equivalent to $\mathcal{L}$. In IOL, all resources are kept in a single 
table, called the resource table, during execution. The consumption of resources can 
be achieved easily by changing their consumption level destructively. The idea 
of this model has already been used as a basis for a compiler system for a useful 
fragment of first-order Lolli, in which the resource table is implemented as an 
array and fast access to resources is achieved by using a hash table. 

For TLLP, we give a refinement of IOT, called IOTL (Fig. 8), with the level 
indices of IOL. The sequent of IOTL is of the form $\vdash^{T}_{L,U} I\,\{G\}\,O$, where T 
is the current time, L is the current consumption level, and U is the current 
consumption marker. 

Each element of an IOTL context is a tuple ⟨R, t, ℓ⟩, where R is a resource 
formula, t is its consumption time, and ℓ is its consumption level. Linear resources 
have the form ⟨S1 & … & Sm, t, ℓ⟩ or ⟨□(S1 & … & Sm), t, ℓ⟩, where t is calculated 
from the value of T and the multiplicity of ○, and ℓ is the value of L at which 
the resource can be consumed. Intuitionistic resources have the form ⟨S, 0, 0⟩, 
where S is a resource clause. 

Logic Programming in a Fragment of Intuitionistic Temporal Linear Logic 327 

When the system executes $\vdash^{T}_{L,U} I\,\{G\}\,O$, the consumable resources in the 
context I have the following forms: ⟨S1 & … & Sm, T, L⟩, ⟨□(S1 & … & Sm), t, L⟩ 
where t ≤ T, and ⟨S, 0, 0⟩. 

The relation $\mathit{pick}^{T}_{L,U}(I,M,S)$ selects a consumable resource clause S from 
the input context I. The output context M is the same as I, except that the 
consumption level of the selected clause is changed to the value of U if it is a 
linear resource. The relation $\mathit{change}_{\ell,\ell'}(M,N)$ modifies the context M so that 
any resources in M with level ℓ have their level changed to ℓ' in the context 
N. The relation $\mathit{thinable}_{\ell}(O)$ checks that no resource in O has ℓ as 
its consumption level. The relation $\mathit{subcontext}_{U,L}(O, I)$ consumes some 
resources. The output context O is the same as I, except that the consumption 
levels of some resources are changed to the value of U, if they are linear resources. 



6 Implementation Design 

In this section, we discuss implementation issues for the TLLP language. 

For Lolli, Hodas has developed I/O model-based interpreters in both 
Prolog and SML. Tamura et al. have designed an extension of the standard WAM 
model (called LLPAM) and have developed a compiler system. This compiler 
system supports the first-order Lolli language, except for the goal ∀x.G. LLPAM 
was first designed based on IOL [10][15], and was recently refined with the top flag 
in LRM [9] for the complete treatment of ⊤. LLPAM has also been improved 
to incorporate the resource compilation technique in [2]. 

The main differences between LLPAM and WAM are as follows: 

— Three new data areas are added: RES (the resource table), SYMBOL (the 
symbol table), and HASH (the hash table). HASH and SYMBOL are used for fast 
access to resources, and the entries in RES are hashed on the predicate symbol 
and the value of the first argument in the current implementation. 

— Six new registers are added: R, L, U, T, R1, and R2. R is the current top of 
RES. L and U are the current values of L (current consumption level) and U 
(current consumption marker) in IOL, respectively. T is the top flag in LRM. 
R1 and R2 are used for picking up consumable resources quickly. 

— New instructions for the newly added connectives are added. 

The latest package (version 0.5) including all source code can be obtained from 
http://bach.seg.kobe-u.ac.jp/llp/. 

For TLLP, there seem to be at least three approaches to developing efficient 
implementations: a TLLP interpreter, a translator from TLLP to Lolli, and a TLLP 
compiler. First, it is easy to implement a TLLP interpreter based on IOT and 
IOTL in high-level languages like Prolog, but the resulting systems are slow. 
Secondly, it is possible to translate TLLP programs into Lolli programs by adding 
a new argument for the current time T in IOT to each predicate in Lolli. The 
drawback of this simple translation is that the goal ⊤ in TLLP cannot be 
correctly translated into that in Lolli. Finally, it is also possible to extend 
LLPAM to support the TLLP language. We summarize the important points that 
have been extended: 

— Two new fields, time and box, have been added to each entry in RES. The 
time field denotes the consumption time in IOTL. The box flag is set to true 
if the newly added resource is prefixed by □, and to false otherwise. 

— A new register TI has been added. TI denotes the current time T in IOTL. 
Choice instructions such as try in WAM therefore need to save and restore 
the value of TI. TI is used to set the time field of a newly added resource. TI 
is also used as a hash key for fast access to the resources. 

— In LLPAM, the instruction add_res Ai, Aj is used to add linear resource 
clauses, where Ai is the head and Aj is the closure, which consists of the compiled 
code and a set of bindings for free variables. We replaced this instruction with 
two new instructions, add_exact_timed_res Ai, Aj, n and add_timed_res 
Ai, Aj, n. The former is used to add a resource clause Si (1 ≤ i ≤ m) 
in ○^n(S1 & … & Sm), where Ai is its head, Aj is its closure, and n is the 
multiplicity of ○. The latter is used to add a resource clause Si (1 ≤ i ≤ m) in 
○^n □(S1 & … & Sm), where Ai is its head, Aj is its closure, and n is the multiplicity 
of ○. 

— In LLPAM, the instruction pickup_resource p/n, Ai, L finds a consumable 
resource with predicate symbol p/n by checking its consumption level, and 
then sets Ai to its index value. If there are no consumable resources, it 
jumps to L. We need to extend this instruction so that it checks not only the 
level condition but also the time condition, by comparing the consumption 
time (the time field) of resources with the current time (the current value 
of TI). 
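
The combined level and time check can be illustrated with a small Python model of the resource table. This is a sketch under stated assumptions, not LLPAM code: the `Entry` class and the functions `add_timed`, `consumable` and `pickup_resource` are hypothetical names, the table is a plain list without hashing, and the time condition (exact match for unboxed resources, "not later than now" for □-prefixed ones) follows the consumable forms listed for IOTL.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Entry:
    pred: str    # predicate symbol of the resource clause head
    time: int    # consumption time (the `time` field)
    level: int   # consumption level (0 = intuitionistic, negative = consumed)
    box: bool    # True if the resource is prefixed by □ (the `box` flag)

def add_timed(res: List[Entry], pred: str, TI: int, L: int, n: int, box: bool) -> None:
    """Model of adding a resource under n occurrences of ○: its time is TI + n."""
    res.append(Entry(pred, TI + n, L, box))

def consumable(e: Entry, TI: int, L: int) -> bool:
    if e.level == 0:                       # intuitionistic resource: always usable
        return True
    if e.level != L:                       # level condition
        return False
    return e.time <= TI if e.box else e.time == TI   # time condition

def pickup_resource(res: List[Entry], pred: str, TI: int, L: int) -> Optional[int]:
    """Return the index of the first consumable entry for `pred`, or None."""
    for i, e in enumerate(res):
        if e.pred == pred and consumable(e, TI, L):
            return i
    return None
```

For instance, a resource added at time 0 under two ○ operators is found only when TI is 2, while a □-prefixed resource added under one ○ is found at every time from 1 onwards.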

The specification of LLPAM is given in [9] and [2]. 

7 Conclusion and Future Work 

Recent developments in logic programming languages based on linear logic 
suggest a successful direction for extending logic programming to be more expressive 
and more efficient. In this paper, we have designed the logic programming 
language TLLP based on intuitionistic temporal linear logic and have discussed 
some implementation issues for TLLP. The following issues remain open: 

— TLLP supports only a small fragment of intuitionistic temporal linear logic. 

— IOTL needs to be refined with the idea of the top flag in LRM [9] for the 
complete treatment of ⊤. 

— The goal ∀x.G is not supported in the current implementation. 

Currently, we have developed a prototype of a TLLP compiler system based 
on the extension presented in this paper. The latest TLLP package (version 0.1) 
including TLLP interpreters, a translator from TLLP into Lolli, and a prototype 
compiler is available from http://kaminari.scitec.kobe-u.ac.jp/tllp/. 

Abstract. This paper adds the handling of negative information to a 
functional-logic deductive database language. By adopting as semantics 
for negation the so-called CRWLF, wherein negation is intended as 
'finite failure' of reduction, we define Herbrand algebras and models 
for this semantics and a fix point operator to be used in a new 
goal-directed bottom-up evaluation mechanism based on magic transformations. 
This bottom-up evaluation simulates the top-down one of the 
original program; in fact, it carries out a goal-directed lazy evaluation. 



1 Introduction 

Deductive databases [17] are database management systems whose query 
language and, usually, storage structure are designed around a logical data model. 
They offer a rich query language which extends the relational model in many 
directions (for instance, support for non-first normal form and recursive relations) 
and they are suited for applications in which large amounts of data must be 
accessed and complex queries must be supported. 

With respect to the operational semantics, most deductive database systems 
(for instance, DATALOG [19], CORAL [16], ADITI [20]) use bottom-up evaluation 
instead of top-down evaluation as in Prolog systems. The reason is that the 
bottom-up approach allows set-at-a-time evaluation, i.e. it processes sets of goals 
rather than proceeding one (sub)goal at a time, so that operations like relational 
joins can be performed efficiently on disk-resident data. Therefore, when the program 
is data-intensive, this evaluation method is potentially much more efficient than 
top-down techniques. The idea of goal-directed bottom-up evaluation is to 
generate, by using the fix point operator [3], the subset of the Herbrand model of 
the program relevant to solving the query. With this aim, bottom-up evaluation 
in such languages involves a query-program transformation termed Magic 
Sets [5], in such a way that a logic program-query is transformed into a magic 
logic program-query whose bottom-up evaluation is devised to simulate the 
top-down one of the original program and query. The program is evaluated until no 
new facts are generated or the answer to the query is found. The transformed 
program adds new predicates, called magic predicates, whose role is to pass 
information (instantiated and partially instantiated arguments in the predicates 
of the query) to the program in order to consider only those instances of the 
program rules relevant to solving the query. Several transformation methods have 
been studied in the past, for instance, Generalized Magic Sets [5], Generalized 
Supplementary Magic Sets [5] and Magic Templates [15]. 

The use of negation in deductive databases increases their expressive 
power as query languages. The introduction of negation in logic programming 
(see [4] for a survey), and thus the semantic models of the so-called 
general logic programs, have been widely studied in the past. The most relevant 
semantics for handling negation are the stable model semantics [6] and the 
well-founded semantics [21]. 

However, the incorporation of negation into deductive languages raises 
new problems for the magic transformation. New magic transformations 
have been proposed, notably the one for modularly stratified 
programs [18] and the doubled program approach [10], both adopting the 
well-founded semantics. The problem arises from the three-valued nature of the 
resulting magic predicates: the well-founded model of the transformed 
magic program and that of the original one may disagree [10]. In [10], classes of 
sideways information-passing strategies (also called sips) which ensure that the magic 
sets are two-valued are defined. These sips, named well-founded sips, make sure 
that the well-founded model of a program is preserved w.r.t. the query in the 
transformed program. Moreover, they subsume the left-to-right sips intended for 
modularly stratified programs [18]. Finally, the authors of [10] present a new magic 
transformation using a doubled program technique which preserves the well-founded 
model w.r.t. the query regardless of the sips used. The drawback of this 
approach is that the bottom-up evaluation of the program must terminate. 

On the other hand, the integration of functional and logic programming has 
been widely investigated in recent years. It has led to the design of 
modern programming languages such as CURRY [9] and TOY [13]. The basic 
idea in functional-logic programming is to use lazy narrowing as the operational 
mechanism, following some class of narrowing strategy combined with some kind 
of constraint solving and higher-order features; see [8] for a survey. 

In [1], a framework for goal-directed bottom-up evaluation of functional-logic 
programs without negation has been proposed. As in the logic paradigm, the 
bottom-up evaluation is based on a magic transformation of a given 
program-query into a magic program-query. In the cited paper, the semantics adopted for 
the programs is the Constructor Based ReWriting Logic (CRWL) presented in 
[7]. This bottom-up evaluation is based on the use of a fix point operator over 
CRWL-Herbrand algebras and it simulates the demand-driven strategy [11] for 
top-down evaluation of CRWL-programs. 

Recently, a framework called Constructor Based ReWriting Logic with Failure 
(CRWLF) has been presented in [14], extending the semantics CRWL in order 
to handle negative information, wherein negation is intended as 'finite 
failure' of reduction. The formulas that are provable in CRWL can also be proved 
in CRWLF but, in addition, CRWLF provides 'proofs of unprovability' within 
CRWL. CRWLF can only give an approximation to failure in CRWL that 
corresponds to the cases in which unprovability refers to 'finite failure' of reduction. 

The aim of this paper is to add the handling of negative information to a 
functional-logic language, and to present a goal-directed bottom-up evaluation 
mechanism in order to get a computational model for a “functional-logic” deduc- 
tive language. With this aim, we will replace the semantics CRWL used in [1], 
by the semantics CRWLF and we will present Herbrand algebras and models 
and a fix point operator which computes the Herbrand model of a CRWLF- 
program. Finally, we will propose an extension of our goal-directed bottom-up 
evaluation, based on a new magic transformation and the use of the defined fix 
point operator. 

As an example, a “functional-logic” deductive database can handle a 
hierarchical line of bosses as follows: 

(1) boss(jesus) → jaime. 
(2) boss(jaime) → antonio. 
(3) member(X, []) → false. 
(4) member(X, [Y|L]) → member(X, L) ⇐ X ⋈̸ Y. 
(5) member(X, [Y|L]) → true ⇐ X ⋈ Y. 
(6) superboss(P) → [boss(P) | superboss(boss(P))]. 

Goal: :- member(jaime, superboss(jesus)) ⋈ true. 

where ⋈ and ⋈̸ refer to a joinability constraint (both sides reduce to the same 
constructor term) and its logical negation, respectively. Our evaluation method 
evaluates the function superboss, which defines a possibly infinite data structure, only as 
far as needed in order to solve the goal. It starts with superboss(jesus) as 
⊥, which represents the undefined value, and then superboss(jesus) is 
evaluated up to [jaime|⊥], which is all that is needed to solve the goal. Moreover, in our 
framework, negative constraints have to be solved lazily. For instance, given 
the query superboss(X) ⋈̸ superboss(Y) w.r.t. the above program and 
starting with superboss(X) and superboss(Y) as ⊥, the evaluation can bind 
the variables X to jesus and Y to jaime, evaluating superboss(jesus) up to 
[jaime|⊥] and superboss(jaime) up to [antonio|⊥]. Since [jaime|⊥] conflicts 
with [antonio|⊥], we obtain the answer X = jesus, Y = jaime. 
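
The demand-driven behaviour of this example can be imitated with a Python generator. This is only an analogy under our own assumptions, not the paper's bottom-up method: the `BOSS` table, the `forced` trace and the function names are illustrative, and the generator simply stops where boss is undefined, whereas in CRWLF the corresponding call would yield a failure value.

```python
BOSS = {"jesus": "jaime", "jaime": "antonio"}   # rules (1) and (2)
forced = []                                     # trace of list elements actually computed

def superboss(p):
    """Lazily enumerate the chain of bosses above p (rule (6))."""
    while p in BOSS:
        p = BOSS[p]
        forced.append(p)
        yield p

def member(x, xs):
    """Rules (3)-(5): scan xs only until x is found."""
    for y in xs:
        if x == y:
            return True
    return False

found = member("jaime", superboss("jesus"))
```

After the call, `found` is true and `forced` contains only `jaime`: the tail of the list was never computed, which mirrors the partial value [jaime|⊥] mentioned above.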

As theoretical results of this paper, we establish the soundness and 
completeness of our bottom-up evaluation w.r.t. Herbrand models and the 
proof-semantics CRWLF. Moreover, we establish correspondences between 
proofs of a given goal in the cited logic and the “facts” computed by means of 
the bottom-up evaluation, showing the optimality of our method. 

The rest of the paper will be organized as follows. In section 2, we will in- 
troduce CRWLF; section 3 will define the fix point operator and the Herbrand 
models; section 4 will present the magic transformation; section 5 will establish 
soundness, completeness and optimality results and, finally, section 6 will de- 
scribe the conclusions and future work. Due to the lack of space, the full proofs 
of our results can be found in [2]. 
2 The CRWLF Framework 

In this section we summarize the Constructor Based ReWriting Logic with Failure 
presented in [14]. Assuming a signature Σ = DC ∪ FS, where DC = ⋃_{n∈ℕ} DC^n 
is a set of constructor symbols c, d, … and FS = ⋃_{n∈ℕ} FS^n is a set of function 
symbols f, g, …, all of them with associated arity and such that DC ∩ FS = ∅, 
and also a countable set V of variable symbols X, Y, …, we write Term for the 
set of (total) terms e, e', … (also called expressions) built up with Σ and V in the 
usual way, and we distinguish the subset CTerm of (total) constructor terms or 
(total) c-terms s, t, …, built up only with DC and V. Terms are intended to represent 
possibly reducible expressions, whereas c-terms represent data values, not further 
reducible. We extend the signature Σ by adding two new constants: ⊥, which plays 
the role of the undefined value, and F, which is used as an explicit representation 
of failure of reduction. The set Term_⊥ of partial terms and the set CTerm_⊥ 
of partial c-terms are defined in the natural way. Partial c-terms represent the 
result of partially evaluated expressions, and thus they can be considered as 
approximations to the value of expressions. Moreover, we will consider the sets 
Term_{⊥,F} and CTerm_{⊥,F}. A natural approximation ordering ⊑ over CTerm_{⊥,F} 
can be defined as the least partial ordering satisfying: ⊥ ⊑ t, X ⊑ X, and 
h(t1, …, tn) ⊑ h(t1', …, tn') if ti ⊑ ti' for all i ∈ {1, …, n}, h ∈ DC ∪ {F}. The 
intended meaning of t ⊑ t' is that t is less defined or has less information than 
t'. Note that the only relations satisfied by F are ⊥ ⊑ F and F ⊑ F. In particular, 
F is maximal. This is reasonable, since F represents 'failure of reduction' and this 
gives information about the result of the evaluation of an expression that cannot 
be refined further. 
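
As a concrete reading of the approximation ordering, the following Python sketch checks it on a simple term encoding. The encoding is our own assumption and not part of the framework: variables are plain strings, every other term is a tuple whose first component is the symbol, and `BOT`, `FAIL` and `leq` are hypothetical names.

```python
BOT = ("⊥",)    # undefined value
FAIL = ("F",)   # failure of reduction

def leq(t, u):
    """Approximation ordering: True when t carries no more information than u."""
    if t == BOT:                                   # ⊥ is below every term
        return True
    if isinstance(t, str) or isinstance(u, str):   # a variable is only below itself
        return t == u
    # same symbol (a constructor or F), same arity, arguments ordered pointwise
    return (t[0] == u[0] and len(t) == len(u)
            and all(leq(a, b) for a, b in zip(t[1:], u[1:])))
```

With this encoding, `leq(BOT, FAIL)` and `leq(FAIL, FAIL)` hold, while `leq(FAIL, ("zero",))` and `leq(("zero",), FAIL)` do not, which reflects the maximality of F.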

In the context of CRWLF, a program P is a set of conditional rewrite rules 
of the form: 

$$\underbrace{f(t_1,\dots,t_n)}_{\text{head}} \to \underbrace{r}_{\text{body}} \Leftarrow \underbrace{C}_{\text{condition}}$$

where f ∈ FS^n, and fulfilling the following conditions: (t1, …, tn) is a linear tuple 
(each variable in it occurs only once) with t1, …, tn ∈ CTerm; r ∈ Term; C is a 
set of constraints of the form e' ⋈ e'' (joinability), e' ◇ e'' (divergence), e' ⋈̸ e'' 
(failure of joinability) or e' ◇̸ e'' (failure of divergence) where e', e'' ∈ Term; 
extra variables are not allowed, i.e. var(r) ∪ var(C) ⊆ var(t̄). The reading of 
the rule is: f(t1, …, tn) reduces to r if the condition C is satisfied. We will need 
to use c-instances of rules, where the set of c-instances of a program rule R is defined as 
[R]_{⊥,F} = {Rθ | θ ∈ CSubst_{⊥,F}} with CSubst_{⊥,F} = {θ : V → CTerm_{⊥,F}}. 

In our framework, because we allow non-determinism, an expression 
can in general be reduced to an infinite set of values, but we will need some finite 
representation of these sets. For example, we can define the non-deterministic 
function f as: f → zero, f → suc(suc(f)). It is easy to see that f can be reduced 



^ The original CRWL framework [7] only considered joinabilities; divergence con- 
straints were incorporated in [12] and failures of both joinabilities and divergences 
are introduced in [14]. 

^ In [7] extra variables are allowed, but the use of function nesting and non- 
deterministic functions can nicely replace them in many cases. 
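Table 1. Rules for CRWLF-provability 

$$(1)\quad \frac{}{e \lhd \{\bot\}}$$

$$(2)\quad \frac{}{X \lhd \{X\}} \qquad X \in V$$

$$(3)\quad \frac{e_1 \lhd C_1 \;\cdots\; e_n \lhd C_n}{c(e_1,\dots,e_n) \lhd \{c(\bar t) \mid \bar t \in C_1 \times \cdots \times C_n\}} \qquad c \in DC^n \cup \{\mathrm{F}\}$$

$$(4)\quad \frac{e_1 \lhd C_1 \;\cdots\; e_n \lhd C_n \qquad \cdots\; f(\bar t) \lhd_R C_{R,\bar t} \;\cdots}{f(e_1,\dots,e_n) \lhd \bigcup_{R \in \mathcal{P}_f,\ \bar t \in C_1 \times \cdots \times C_n} C_{R,\bar t}} \qquad f \in FS^n$$

$$(5)\quad \frac{}{f(\bar t) \lhd_R \{\bot\}}$$

$$(6)\quad \frac{r \lhd C \qquad C'}{f(\bar t) \lhd_R C} \qquad (f(\bar t) \to r \Leftarrow C') \in [R]_{\bot,\mathrm{F}}$$

$$(7)\quad \frac{e_i \mathbin{\tilde\blacklozenge} e_i'}{f(\bar t) \lhd_R \{\mathrm{F}\}} \qquad (f(\bar t) \to r \Leftarrow \dots, e_i \mathbin{\blacklozenge} e_i', \dots) \in [R]_{\bot,\mathrm{F}}$$

$$(8)\quad \frac{}{f(t_1,\dots,t_n) \lhd_R \{\mathrm{F}\}} \qquad R = (f(s_1,\dots,s_n) \to r \Leftarrow C),\ t_i \text{ and } s_i \text{ have a } DC \cup \{\mathrm{F}\}\text{-clash for some } i \in \{1,\dots,n\}$$

$$(9)\quad \frac{e \lhd C \qquad e' \lhd C'}{e \bowtie e'} \qquad \exists t \in C,\ t' \in C' :\ t \downarrow t'$$

$$(10)\quad \frac{e \lhd C \qquad e' \lhd C'}{e \mathbin{\Diamond} e'} \qquad \exists t \in C,\ t' \in C' :\ t \uparrow t'$$

$$(11)\quad \frac{e \lhd C \qquad e' \lhd C'}{e \not\bowtie e'} \qquad \forall t \in C,\ t' \in C' :\ t \not\downarrow t'$$

$$(12)\quad \frac{e \lhd C \qquad e' \lhd C'}{e \mathbin{\not\Diamond} e'} \qquad \forall t \in C,\ t' \in C' :\ t \not\uparrow t'$$
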

to the values zero, suc(suc(zero)), suc(suc(suc(suc(zero)))), … We can use the 
undefined value ⊥ to express that the possible reductions of f have the form 
zero or suc(suc(⊥)), written f ◁ {zero, suc(suc(⊥))}. This set of values is a 
Sufficient Approximation Set (SAS) for f which provides enough information 
about the values of f to prove that f cannot be reduced to suc(zero). Of course, 
an expression will have multiple SASs. Any expression has {⊥} as its simplest 
SAS and, for example, the expression f has an infinite number of SASs: {⊥}, 
{zero, suc(suc(⊥))}, {zero, suc(suc(zero)), suc(suc(suc(suc(⊥))))}, … 
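
The family of SASs for f listed above can be generated mechanically. The Python sketch below does so for this particular function only; it is an illustration with an assumed term encoding (tuples headed by the symbol name) and hypothetical helper names, not an implementation of the CRWLF calculus.

```python
BOT = ("⊥",)
ZERO = ("zero",)

def suc(t):
    return ("suc", t)

def sas_f(k):
    """k-th approximation set for f, where f -> zero and f -> suc(suc(f))."""
    if k == 0:
        return {BOT}
    return {ZERO} | {suc(suc(t)) for t in sas_f(k - 1)}

def dc_clash(t, u):
    """True when t and u have different constructors at some common position."""
    if t == BOT or u == BOT:
        return False
    if t[0] != u[0] or len(t) != len(u):
        return True
    return any(dc_clash(a, b) for a, b in zip(t[1:], u[1:]))

# Every element of the second SAS clashes with suc(zero), so this SAS is already
# enough to show that f cannot be reduced to suc(zero); {⊥} alone is not.
enough = all(dc_clash(t, suc(ZERO)) for t in sas_f(1))
too_coarse = all(dc_clash(t, suc(ZERO)) for t in sas_f(0))
```

Here `sas_f(1)` is {zero, suc(suc(⊥))} and `sas_f(2)` is {zero, suc(suc(zero)), suc(suc(suc(suc(⊥))))}, matching the sets given in the text.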

In CRWLF five kinds of statements can be deduced (assume e, e' ∈ Term_{⊥,F}): 

• e ◁ C: C is a SAS for e; 

• e ⋈ e' (joinability): e and e' can both be reduced to some t ∈ CTerm; 

• e ◇ e' (divergence): e and e' can be reduced to some (possibly partial) 
c-terms t and t' having a DC-clash; 

• e ⋈̸ e': failure of e ⋈ e'; 

• e ◇̸ e': failure of e ◇ e'. 

where, given a set of constructors S, we say that the c-terms t and t' have an 
S-clash if they have different constructors of S at the same position. 

We will use the symbol ♦ to refer to any of the constraints ⋈, ◇, ⋈̸, ◇̸. 
The constraints ⋈̸ and ⋈ are called the complementary of each other; the same 
holds for ◇̸ and ◇, and we write ♦̃ for the complementary of ♦. When proving 
a constraint e ♦ e', the calculus CRWLF will evaluate a SAS for each of the expressions 
e and e'. These SASs consist of c-terms, and provability of the constraint 
e ♦ e' depends on certain syntactic (hence decidable) relations between them, 
defined as follows. 

Definition 1 (Relations over CTerm_{⊥,F}). 

• t ↓ t' ⇔def t = t', t ∈ CTerm 

• t ↑ t' ⇔def t and t' have a DC-clash 

• t ↓̸ t' ⇔def t or t' contain F as subterm, or they have a DC-clash 

• ↑̸ is defined as the least symmetric relation over CTerm_{⊥,F} satisfying: 

i) X ↑̸ X, for all X ∈ V 
ii) F ↑̸ t, for all t ∈ CTerm_{⊥,F} 
iii) if t1 ↑̸ t1', …, tn ↑̸ tn' then c(t1, …, tn) ↑̸ c(t1', …, tn'), for c ∈ DC^n 
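
Since the four relations of Definition 1 are purely syntactic, they can be written as short recursive checks. The Python sketch below is one possible reading under an assumed encoding (variables as strings, other terms as tuples headed by their symbol); the function names are ours, and the code is meant as an aid to reading the definition rather than as part of the calculus.

```python
BOT, FAIL = ("⊥",), ("F",)

def is_var(t):
    return isinstance(t, str)

def is_total(t):
    """t is in CTerm: it contains neither ⊥ nor F."""
    if is_var(t):
        return True
    return t not in (BOT, FAIL) and all(is_total(a) for a in t[1:])

def contains_fail(t):
    return (not is_var(t)) and (t == FAIL or any(contains_fail(a) for a in t[1:]))

def dc_clash(t, u):
    """Different constructors of DC at the same position."""
    if is_var(t) or is_var(u) or t in (BOT, FAIL) or u in (BOT, FAIL):
        return False
    if t[0] != u[0] or len(t) != len(u):
        return True
    return any(dc_clash(a, b) for a, b in zip(t[1:], u[1:]))

def joinable(t, u):          # t ↓ t'
    return t == u and is_total(t)

def divergent(t, u):         # t ↑ t'
    return dc_clash(t, u)

def not_joinable(t, u):      # failure of joinability
    return contains_fail(t) or contains_fail(u) or dc_clash(t, u)

def not_divergent(t, u):     # failure of divergence (least symmetric relation)
    if t == FAIL or u == FAIL:                      # clause ii) and its symmetric case
        return True
    if is_var(t) or is_var(u):                      # clause i)
        return t == u
    if t == BOT or u == BOT:                        # ⊥ might still be refined either way
        return False
    return (t[0] == u[0] and len(t) == len(u)       # clause iii)
            and all(not_divergent(a, b) for a, b in zip(t[1:], u[1:])))
```

Note that two copies of suc(⊥) are related by none of the four relations: they carry too little information to decide either a constraint or its complementary.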

Table 1 shows the rules for the CRWLF-calculus. Rule (1) considers {⊥} as 
the simplest SAS for any expression. Rules (2) and (3) consider the case of 
variables and constructors. In rule (4), when evaluating a function call f(e1, …, en), 
we produce SASs for the arguments. Next, we consider all the possible values t̄ 
of the cross product of these SASs and all the rules R for f, denoted by P_f, 
generating a SAS for each combination: f(t̄) ◁_R C_{R,t̄}. The notation ◁_R indicates 
that only the rule R is used to produce the corresponding SAS. By joining the 
SASs C_{R,t̄}, we obtain the final SAS for f(e1, …, en). Rules (5) to (8) consider all 
the possible ways in which a rule R can be used to produce a SAS for a call f(t̄) 
where ti ∈ CTerm_{⊥,F}. Rule (5) is the trivial case. Rule (6) applies when there 
exists a c-instance of R with head f(t̄) and the constraints C of this c-instance 
are provable. In this case, the SAS will be the one produced by the body of the 
c-instance. Rules (7) and (8) produce the SAS {F} due to a failure of one of the 
constraints (by proving the complementary) and a failure in parameter passing, 
respectively. Finally, rules (9) to (12) deal with constraints and are easily 
defined by using the relations ↓, ↑, ↓̸, ↑̸. 

It can be proved that the relations ◁, ⋈, ◇, ⋈̸, ◇̸ satisfy some desirable 
properties such as monotonicity or closure under substitutions. On the other 
hand, if e ♦ e' is provable, then e ♦̃ e' is not provable. Full details about CRWLF 
can be found in [14]. 

Finally, a goal G is a set of constraints, and a solution of G w.r.t. a 
CRWLF-program P is a substitution θ ∈ CSubst_{⊥,F} such that P ⊢_CRWLF Gθ. 



3 Herbrand Models 

In this section we present CRWLF-Herbrand algebras and a fix point operator 
for computing the least Herbrand model of a program. We assume the reader 
is familiar with the basic concepts of model theory in functional and logic 
programming (see [3,7] for more details), but we recall some notions here. 

Given S, a partially ordered set (in short, poset) with a least element 
⊥ (equipped with a partial order ⊑), the set of all totally defined elements 
of S will be denoted by Def(S). We write C(S), I(S) for the sets of cones and 
ideals of S, respectively. The set S̄ =def I(S) denotes the ideal completion of S, 
which is also a poset under the set-inclusion ordering ⊆, and there is a natural 
mapping of each x ∈ S into the principal ideal generated by x, ⟨x⟩ =def {y ∈ 
S : y ⊑ x} ∈ S̄. Furthermore, S̄ is a cpo (i.e. every directed set D ⊆ S̄ has a 
least upper bound) whose finite elements are the principal ideals ⟨x⟩, x ∈ S. 




A Computational Model for Functional Logic Deductive Databases 337 



Definition 2 (Herbrand Algebras). For any given signature Σ, a Herbrand 
algebra ℋ is an algebraic structure of the form ℋ = (CTerm_{⊥,F}, {f^ℋ}_{f∈FS}) 
where CTerm_{⊥,F} is a poset with the approximation ordering ⊑ and f^ℋ ∈ 
[CTerm^n_{⊥,F} →_n CTerm_{⊥,F}] for f ∈ FS^n, where [D →_n E] =def {f : D → 
C(E) | ∀ u, u' ∈ D : (u ⊑ u' ⇒ f(u) ⊆ f(u'))}. From the set {f^ℋ}_{f∈FS}, 
we can distinguish the deterministic functions f ∈ FS^n, holding that f^ℋ ∈ 
[CTerm^n_{⊥,F} →_d CTerm_{⊥,F}] where [D →_d E] =def {f ∈ [D →_n E] | ∀ u ∈ D : 
f(u) ∈ I(E)}. Finally, the elements of Def(ℋ) are the elements of CTerm_F. 

Definition 3 (Herbrand Denotation). Let ℋ be a Herbrand algebra. A 
valuation over ℋ is any mapping η : V → CTerm_{⊥,F}, and we say that η is totally 
defined iff η(X) ∈ Def(ℋ) for all X ∈ V. We denote by Val(ℋ) the set of all 
valuations, and by DefVal(ℋ) the set of all totally defined valuations. The 
evaluation of an e ∈ Term_{⊥,F} in ℋ under η yields ⟦e⟧^ℋ η ∈ C(CTerm_{⊥,F}), which is 
defined recursively as follows: 

— ⟦⊥⟧^ℋ η =def ⟨⊥⟩, ⟦F⟧^ℋ η =def ⟨F⟩ and ⟦X⟧^ℋ η =def ⟨η(X)⟩, for X ∈ V. 

— ⟦c(e1, …, en)⟧^ℋ η =def ⟨c(⟦e1⟧^ℋ η, …, ⟦en⟧^ℋ η)⟩ for all c ∈ DC^n. 

— ⟦f(e1, …, en)⟧^ℋ η =def f^ℋ(⟦e1⟧^ℋ η, …, ⟦en⟧^ℋ η), for all f ∈ FS^n. 

Due to non-determinism, the evaluation of an expression yields a cone rather 
than an element. It can be proved that, for any e ∈ Term_{⊥,F} and η ∈ Val(ℋ), 
⟦e⟧^ℋ η ∈ I(CTerm_{⊥,F}) if f^ℋ is deterministic for every f occurring in e, and 
⟦e⟧^ℋ η ∈ I(CTerm_F) if e ∈ Term_F and η ∈ DefVal(ℋ). 

Definition 4 (Poset of Herbrand Algebras). We can define a partial order 
over the Herbrand algebras as follows: given A and B, A ⊑ B iff f^A(t1, 
…, tn) ⊆ f^B(t1, …, tn) for every f ∈ FS^n and ti ∈ CTerm_{⊥,F}, 1 ≤ i ≤ n. 
With this order, the Herbrand algebras form a poset with bottom. 

Moreover, it can be proved that the ideal completion of this poset is a cpo, 
called HALG, and ⟦ ⟧ is continuous in HALG. 

Definition 5 (Herbrand Models). Given a program P and a Herbrand 
algebra ℋ, we define: 

— ℋ satisfies a SAS e ◁ C under a valuation η (in symbols, (ℋ,η) ⊨ e ◁ C) 
iff ⟦t⟧^ℋ η ⊆ ⟦e⟧^ℋ η for every t ∈ C. 

— ℋ satisfies a joinability e ⋈ e' under a valuation η (in symbols, (ℋ,η) ⊨ 
e ⋈ e') iff there exist t ∈ ⟦e⟧^ℋ η ∩ CTerm_{⊥,F} and t' ∈ ⟦e'⟧^ℋ η ∩ CTerm_{⊥,F} 
such that t ↓ t'. 

— ℋ satisfies a divergence e ◇ e' under a valuation η (in symbols, (ℋ,η) ⊨ 
e ◇ e') iff there exist t ∈ ⟦e⟧^ℋ η ∩ CTerm_{⊥,F} and t' ∈ ⟦e'⟧^ℋ η ∩ CTerm_{⊥,F} 
such that t ↑ t'. 

— ℋ satisfies a failure of joinability e ⋈̸ e' under a valuation η (in symbols, 
(ℋ,η) ⊨ e ⋈̸ e') iff for every t ∈ ⟦e⟧^ℋ η ∩ CTerm_{⊥,F} and t' ∈ ⟦e'⟧^ℋ η ∩ 
CTerm_{⊥,F}, t ↓̸ t' holds. 

— ℋ satisfies a failure of divergence e ◇̸ e' under a valuation η (in symbols, 
(ℋ,η) ⊨ e ◇̸ e') iff for every t ∈ ⟦e⟧^ℋ η ∩ CTerm_{⊥,F} and t' ∈ ⟦e'⟧^ℋ η ∩ 
CTerm_{⊥,F}, t ↑̸ t' holds. 

— ℋ satisfies a rule f(t̄) → r ⇐ C ∈ P iff 

• every valuation η such that (ℋ,η) ⊨ C verifies ⟦f(t̄)⟧^ℋ η ⊇ ⟦r⟧^ℋ η 

• every valuation η such that, for some i ∈ {1, …, n}, ti and li have a 
DC ∪ {F}-clash, where li ∈ ⟦si⟧^ℋ η, verifies F ∈ ⟦f(s̄)⟧^ℋ η 

• every valuation η such that there exists ei ♦ ei' ∈ C such that (ℋ,η) ⊨ 
ei ♦̃ ei' verifies F ∈ ⟦f(t̄)⟧^ℋ η 

— ℋ is a model of P (in symbols, ℋ ⊨ P) iff ℋ satisfies all the rules in P. 

Definition 6 (Fix Point Operator). Given a Herbrand algebra A and f ∈ 
FS^n, we define the fix point operator as: 

T_P(A, f)(s1, …, sn) =def { ⟦r⟧^A η | if there exist f(t̄) → r ⇐ C ∈ P and η ∈ Val(A) 
such that si ∈ ⟦ti⟧^A η and (A,η) ⊨ C } 

∪ { F | if there exists f(t̄) → r ⇐ C ∈ P such that, for some 
i ∈ {1, …, n}, si and ti have a DC ∪ {F}-clash } 

∪ { F | if there exist f(t̄) → r ⇐ C ∈ P and η ∈ Val(A) such 
that si ∈ ⟦ti⟧^A η and (A,η) ⊨ e ♦̃ e' for some e ♦ e' ∈ C } 

∪ { ⊥ | otherwise } 

Given A ∈ HALG, there exists a unique B ∈ HALG, denoted by T_P(A), 
such that f^B(t1, …, tn) = T_P(A, f)(t1, …, tn) for every f ∈ FS^n and ti ∈ 
CTerm_{⊥,F}, 1 ≤ i ≤ n. The following result, which characterizes the least 
Herbrand model, can be ensured. 

Theorem 1. The fix point operator T_P is continuous in HALG and satisfies: 

1. For every A ∈ HALG: A ⊨ P iff T_P(A) ⊑ A. 

2. T_P has a least fix point M_P = ℋ_P^ω, where ℋ_P^0 is the bottom in HALG and 
ℋ_P^{k+1} = T_P(ℋ_P^k). 

3. M_P is the least Herbrand model of P. 

Moreover, we can see that satisfaction in M_P can be characterized in terms of 
⊢_CRWLF provability: for any constraint φ, P ⊢_CRWLF φ iff (M_P, η) ⊨ φ for all 
η ∈ DefVal(ℋ). Therefore, ⊢_CRWLF is sound and complete w.r.t. the Herbrand 
models, and thus the Herbrand model M_P can be regarded as the intended 
(canonical) model of a program P. 

4 Magic Transformation for Negative Constraints 

In order to present our new magic transformation, we first show the main 
ideas of the one presented in [1] by using the following example, where the 
proposed goal has as solution X = s(s(Z)), Y = s(0). 

Original Program 

(1) f(s(X1), X2) → X2 ⇐ h(X1, X2) ⋈ 0, p(X2) ⋈ s(0). 
(2) g(s(X3)) → s(g(X3)). 
(3) h(s(X4), s(0)) → 0. 
(4) p(s(0)) → s(0). 

Goal 

:- f(g(X), Y) ⋈ s(0). 

Transformed Program 

Filtered Rules 

(1) f(s(X1), X2) → X2 ⇐ mg_f^P(s(X1), X2) ⋈ true, h(X1, X2) ⋈ 0, p(X2) ⋈ s(0). 
(2) g(s(X3)) → s(g(X3)) ⇐ mg_g^P(s(X3)) ⋈ true. 
(3) h(s(X4), s(0)) → 0 ⇐ mg_h^P(s(X4), s(0)) ⋈ true. 
(4) p(s(0)) → s(0) ⇐ mg_p^P(s(0)) ⋈ true. 

Magic Rules 

(5) mg_f^P(mg_g^N(X), Y) → true. 
(6) mg_h^P(X5, X6) → mg_f^P(s(X5), X6). 
(7) mg_p^P(X8) → mg_f^P(s(X7), X8) ⇐ h(X7, X8) ⋈ 0. 
(8) mg_f^P(s(mg_g^N(X9)), X10) → mg_f^P(mg_g^N(s(X9)), X10). 
(9) mg_h^P(s(mg_g^N(X11)), X12) → mg_h^P(mg_g^N(s(X11)), X12). 

Goal Solving Rules 

(10) f(mg_g^N(s(X13)), X14) → f(s(mg_g^N(X13)), X14) ⇐ mg_f^P(mg_g^N(s(X13)), X14) ⋈ true. 
(11) h(mg_g^N(s(X15)), X16) → h(s(mg_g^N(X15)), X16) ⇐ mg_h^P(mg_g^N(s(X15)), X16) ⋈ true. 

Goal 

:- f(mg_g^N(X), Y) ⋈ s(0). 


The magic transformation transforms every pair (𝒫, 𝒢), where 𝒫 is a program and 𝒢 is a goal, into a pair (𝒫^MG, 𝒢^MG) in such a way that the transformed program 𝒫^MG, evaluated by means of the fix point operator, computes solutions for 𝒢^MG which are also solutions of 𝒢 w.r.t. the program 𝒫.

Firstly, the so-called passing magic (boolean) functions of the form mg_f^P, as in logic programming, activate the evaluation of the functions through the fix point operator (see rules (1), (2), (3) and (4) in the transformed program) whenever there exists a call, passing the arguments from the head to the body and conditions of every program rule (see rules (6) and (7) in the above program, obtained from rule (1) of the original program) by means of left-to-right sips.

Secondly, the transformation process adds magic rules for the outermost functions (a set denoted by outer(𝒫, 𝒢)), which are defined as the leftmost functions occurring either in each side of a constraint of the goal, or in the body and each side of a constraint of a program rule of an outermost function, or in each side of a constraint of a program rule of a function occurring in the scope of an outermost function. In the above example, these are f, h and p. The idea is that whenever a function is in the scope of an outermost function, every program rule for the inner function generates a rule, called a nesting magic rule, for the passing magic function of the outermost one (see rule (8), generated by the nesting occurring in the goal). The head and body of every program rule of each inner function are “turned around” and introduced as arguments of the head and body, respectively, of the magic rule for the outermost function, at the same position where they appear in the goal, and the remaining arguments are filled with fresh variables. The inner functions are replaced in the magic rules by the so-called nesting magic constructors of the form mg_f^N, given that patterns in program rules must be c-terms. In order to get a lazy evaluation, these nesting magic rules are introduced only if the nested function is demanded by some rule of the nesting function; that is, the rule pattern is a c-term, or it is a variable occurring out of the scope of a function in the body or in the constraints. Moreover, the same kind of passing magic rules must be introduced for each information passing in the rule. For instance, given that g becomes an inner function for h due to the information passing from the first argument of f to h in rule (1), the nesting magic rule (9) is also included.

Thirdly, every constraint e ⋈ e' ∈ 𝒢 is transformed into a new constraint wherein each inner function is replaced by a nesting magic constructor; in the above example, f(mg_g^N(X), Y) ⋈ s(0). Moreover, a new rule is generated as a seed from the goal of the original program (see rule (5)).
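The replacement of inner functions by nesting magic constructors can be sketched as a small term rewriting. This is only an illustration under stated assumptions: terms are encoded as Python tuples, the set `FUNCTIONS` is taken from the example program, and the helper names `mg_n`, `nest` and `show` are hypothetical.

```python
# A variable is a str; an application is a tuple (symbol, arg1, ..., argn).
FUNCTIONS = {"f", "g", "h", "p"}  # defined function symbols of the example (assumed)


def mg_n(e):
    """Replace every function symbol k in e by the nesting magic constructor mg_k^N."""
    if isinstance(e, str):
        return e
    sym, *args = e
    new_sym = f"mg_{sym}^N" if sym in FUNCTIONS else sym
    return (new_sym, *[mg_n(a) for a in args])


def nest(e):
    """Keep the leftmost (outermost) functions and rewrite the functions in their scope."""
    if isinstance(e, str):
        return e
    sym, *args = e
    if sym in FUNCTIONS:
        return (sym, *[mg_n(a) for a in args])
    return (sym, *[nest(a) for a in args])


def show(e):
    """Pretty-print a term."""
    if isinstance(e, str):
        return e
    sym, *args = e
    return f"{sym}({','.join(show(a) for a in args)})" if args else sym


print(show(nest(("f", ("g", "X"), "Y"))))
```

Applied to the left-hand side of the example goal, the sketch yields the transformed term f(mg_g^N(X),Y); constructor terms such as s(0) are left unchanged.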

Lastly, whenever there exists a nesting, new rules, called goal solving rules, are included; they allow us to obtain the answer of the new transformed goal and are similar to the nesting magic rules (see rules (10) and (11)).

In our new magic transformation, in order to handle negative information according to CRWLF, we have to lazily solve the four kinds of constraints. For instance, ⋈̸ should be lazily solved, that is, evaluated only as far as needed until ⋈̸ holds, as in the following example.

Original Program

(1) f(X1) → s(t(X1)).
(2) g(X2) → 0.
(3) t(X3) → s(s(0)).

Goal

:- f(X) ⋈̸ g(X).

Transformed Program

Filtered Rules

(1) f(X1) → s(t(X1)) ⇐ mg_f^P(X1) ⋈ true.
(2) g(X2) → 0 ⇐ mg_g^P(X2) ⋈ true.
(3) t(X3) → s(s(0)) ⇐ mg_t^P(X3) ⋈ true.

Magic Rules

(4) mg_⋈̸^P(mg_f^N(X), mg_g^N(X)) → true.
(5) mg_⋈̸^P(s(mg_t^N(X4)), X5) → mg_⋈̸^P(mg_f^N(X4), X5).
(6) mg_⋈̸^P(X6, 0) → mg_⋈̸^P(X6, mg_g^N(X7)).
(7) mg_⋈̸^P(s(s(0)), X9) → mg_⋈̸^P(mg_t^N(X8), X9).
(8) mg_⋈̸^P(X10, X11) → mg_⋈̸^P(s(X10), s(X11)).
(9) mg_f^P(X12) → mg_⋈̸^P(mg_f^N(X12), X13).
(10) mg_g^P(X15) → mg_⋈̸^P(X14, mg_g^N(X15)).
(11) mg_t^P(X16) → mg_⋈̸^P(mg_t^N(X16), X17).

Goal

:- f(X) ⋈̸ g(X).



In this example, the function t does not need to be evaluated, since the SASs {s(⊥)} and {0} for the functions f and g, respectively, are enough to solve the goal f(X) ⋈̸ g(X), which holds by DC-clash (see rule (11) of CRWLF).
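The following sketch shows why the partial values s(⊥) and 0 already suffice. It assumes the tuple encoding of terms used for illustration only, a hypothetical `BOTTOM` marker for ⊥, and the reading that a negative constraint is satisfied when every pair of values drawn from the two SASs clashes; the exact CRWLF rule is not reproduced here.

```python
BOTTOM = ("⊥",)  # the undefined value; it never takes part in a clash


def dc_clash(t1, t2):
    """True if t1 and t2 carry different constructors at some common position."""
    if isinstance(t1, str) or isinstance(t2, str):  # variables never clash
        return False
    if t1 == BOTTOM or t2 == BOTTOM:
        return False
    if t1[0] != t2[0] or len(t1) != len(t2):
        return True
    return any(dc_clash(a, b) for a, b in zip(t1[1:], t2[1:]))


def all_pairs_clash(sas1, sas2):
    """Assumed reading: the constraint holds if every pair of values clashes."""
    return all(dc_clash(t, u) for t in sas1 for u in sas2)


sas_f = [("s", BOTTOM)]  # SAS of f(X): {s(⊥)}
sas_g = [("0",)]         # SAS of g(X): {0}
print(all_pairs_clash(sas_f, sas_g))
```

The clash is detected at the root (s against 0), so the argument of s, and hence the function t, is never inspected.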

With this aim, we consider the operators ⋈, <>, ⋈̸ and <≯ as if they were outermost symbols, and the functions occurring in the constraints as if they were inner symbols. (Note that this does not mean that the notion of outermost function is modified in the rest of the paper.) This entails the introduction of nesting magic rules for the operators, and the functions become magic constructors inside these new nesting magic rules. In the above example, the new seed is rule (4), which nests both expressions occurring in the goal constraint, and thus the nesting magic rules (5) and (6) are generated.

Moreover, the magic transformation analyzes the body of each definition rule of the nested functions f and g and detects, in the case of f, a constructor in the body; it therefore generates rule (8). This rule indicates that if the evaluation of both sides of the constraint generates the same outermost constructor, then the bottom-up evaluation has to continue. In the general case, magic rules of this kind for a given constraint are added for every constructor out of the scope of a function occurring either in the constraint or in the body of some rule of the leftmost functions of the constraint. Finally, rules (9), (10) and (11) allow the passing magic functions to be recovered. In the general case, one such rule is added for each outermost function symbol. The bottom-up evaluation of this program is as follows:

ℋ^0_{𝒫^MG} = ⊥

ℋ^1_{𝒫^MG} = {mg_⋈̸^P(mg_f^N(X), mg_g^N(X)) ◁ {true}, ...}.

ℋ^2_{𝒫^MG} = {mg_f^P(X) ◁ {true}, mg_⋈̸^P(s(mg_t^N(X)), mg_g^N(X)) ◁ {true}, mg_g^P(X) ◁ {true}, mg_⋈̸^P(mg_f^N(X), 0) ◁ {true}, ...}.

ℋ^3_{𝒫^MG} = {f(X) ◁ {s(⊥)}, g(X) ◁ {0}, mg_⋈̸^P(s(mg_t^N(X)), 0) ◁ {true}, ...}.

The new magic transformation differs from [1] in a further respect. Negative constraints can be satisfied not only by DC-clash but also when a failure value for functions appears. Failures of reduction can appear in the following cases: (a) failure of the condition of a rule, by rule (7) of CRWLF, or (b) failure in the parameter passing, by rule (8) of CRWLF. In case (a), for the outermost functions, we have the same problems as in logic programming, and the magic transformation must be modified to prevent some magic functions from failing to evaluate to true because the information passing includes constraints with undefined functions. Our transformation ensures that the magic functions are two-valued, that is, their SASs include at least true or F, as in the following example:



Original Program

(1) f(X1,X2) → X2 ⇐ h ⋈ 0, p(X2) ⋈ s(0).
(2) h → h.
(3) p(0) → s(s(0)).
(4) p(s(0)) → s(0).

Goal

:- f(X,Y) ⋈̸ s(0).

Transformed Program

Filtered Rules

(1) f(X1,X2) → X2 ⇐ mg_f^P(X1,X2) ⋈ true, h ⋈ 0, p(X2) ⋈ s(0).
(2) h → h ⇐ mg_h^P ⋈ true.
(3) p(0) → s(s(0)) ⇐ mg_p^P(0) ⋈ true.
(4) p(s(0)) → s(0) ⇐ mg_p^P(s(0)) ⋈ true.

Magic Rules

(5) mg_⋈̸^P(mg_f^N(X,Y), s(0)) → true.
(6) mg_⋈̸^P(X4, X5) → mg_⋈̸^P(mg_f^N(X3,X4), X5) ⇐ h ⋈ 0, p(X4) ⋈ s(0).
(7) mg_⋈^P(mg_h^N, X6) → mg_⋈^P(mg_h^N, X6).
(8) mg_⋈^P(s(s(0)), X7) → mg_⋈^P(mg_p^N(0), X7).
(9) mg_⋈^P(s(0), X8) → mg_⋈^P(mg_p^N(s(0)), X8).
(10) mg_f^P(X9,X10) → mg_⋈̸^P(mg_f^N(X9,X10), X11).
(11) mg_h^P → mg_⋈^P(mg_h^N, X12).
(12) mg_p^P(X13) → mg_⋈^P(mg_p^N(X13), X14).
(13) mg_⋈^P(mg_h^N, 0) → mg_f^P(X15,X16).
(14) mg_⋈^P(mg_p^N(X18), s(0)) → mg_f^P(X17,X18).

Goal

:- f(X,Y) ⋈̸ s(0).

In the previous example, the proposed goal has the answers Y = 0 and Y = s(s(Z)), due to the failure of the instances p(0) ⋈ s(0) and p(s(s(Z))) ⋈ s(0) of the conditions of f.

The idea consists in detecting the outermost functions in negative constraints and, for each definition rule of these functions, avoiding the left-to-right sips (see rules (13) and (14), obtained from rule (1) of the original program).

With the left-to-right sips presented in [1], rule (14) would be replaced by the rule mg_⋈^P(mg_p^N(X18), s(0)) → mg_f^P(X17,X18) ⇐ h ⋈ 0, taking into account the constraint h ⋈ 0 for the information passing to p(X18). Given that h has {⊥} as its unique SAS, the SAS mg_p^P(0) ◁ {true} cannot be generated by using this rule and rule (12). Therefore, the bottom-up evaluation could not obtain the SAS p(0) ◁ {s(s(0))}, which allows us to get the SAS f(X, 0) ◁ {F} that can be used to satisfy the goal. In this way, the evaluation of the program could not generate the answer Y = 0, and thus the use of left-to-right sips produces incompleteness w.r.t. CRWLF.

In case (b), for the outermost functions, the failure of reduction is computed by means of the fix-point operator (see Definition 6). In the above example, the evaluation would compute the SAS p(\overline{s(0)}) ◁ {F} by rule (4), where \overline{s(0)} =def {0, s(s(Z)), F}, which allows the additional answer Y = s(s(Z)) to be generated by applying rule (1) of the transformed program. The bottom-up evaluation for the above program is as follows:

ℋ^0_{𝒫^MG} = ⊥

ℋ^1_{𝒫^MG} = {mg_⋈̸^P(mg_f^N(X,Y), s(0)) ◁ {true}, p(\overline{0}) ◁ {F}, p(\overline{s(0)}) ◁ {F}, p(0) ◁ {F, ⊥}, p(s(0)) ◁ {F, ⊥}, ...}.

ℋ^2_{𝒫^MG} = {mg_f^P(X,Y) ◁ {true}, f(X, s(s(Z))) ◁ {F}, ...}.

ℋ^3_{𝒫^MG} = {mg_⋈^P(mg_h^N, 0) ◁ {true}, mg_⋈^P(mg_p^N(Y), s(0)) ◁ {true}, ...}.

ℋ^4_{𝒫^MG} = {mg_h^P ◁ {true}, mg_p^P(Y) ◁ {true}, mg_⋈^P(s(s(0)), s(0)) ◁ {true}, mg_⋈^P(s(0), s(0)) ◁ {true}, ...}.

ℋ^5_{𝒫^MG} = {p(0) ◁ {F, s(s(0))}, p(s(0)) ◁ {F, s(0)}, ...}.

ℋ^6_{𝒫^MG} = {f(X,0) ◁ {F}, ...}.

and the instances f(X, 0) ⋈̸ s(0) and f(X, s(s(Z))) ⋈̸ s(0) of the goal are satisfied in the Herbrand algebra.

However, this is not enough, given that failures of reduction can also appear in the same cases due to inner functions, that is, due to failure (c) in the condition and (d) in the parameter passing of a rule, as in the following example.

Original Program

(1) f(X1) → s(X1).
(2) g(s(0)) → 0 ⇐ p(0) ⋈ 0.
(3) p(0) → s(s(0)).

Goal

:- f(g(X)) ⋈̸ s(0).

Transformed Program

Filtered Rules

(1) f(X1) → s(X1) ⇐ mg_f^P(X1) ⋈ true.
(2) g(s(0)) → 0 ⇐ mg_g^P(s(0)) ⋈ true, p(0) ⋈ 0.
(3) p(0) → s(s(0)) ⇐ mg_p^P(0) ⋈ true.

Magic Rules

(4) mg_⋈̸^P(mg_f^N(mg_g^N(X)), s(0)) → true.
(5) mg_⋈̸^P(s(X2), X3) → mg_⋈̸^P(mg_f^N(X2), X3).
(6) mg_⋈^P(s(s(0)), X4) → mg_⋈^P(mg_p^N(0), X4).
(7) mg_⋈^P(mg_p^N(0), 0) → mg_f^P(mg_g^N(s(0))).
(8) mg_⋈̸^P(mg_f^N(0), X5) → mg_⋈̸^P(mg_f^N(mg_g^N(s(0))), X5) ⇐ p(0) ⋈ 0.
(9) mg_⋈̸^P(mg_f^N(F), X6) → mg_⋈̸^P(mg_f^N(mg_g^N(0)), X6).
(10) mg_⋈̸^P(mg_f^N(F), X7) → mg_⋈̸^P(mg_f^N(mg_g^N(s(s(Z)))), X7).
(11) mg_f^P(X8) → mg_⋈̸^P(mg_f^N(X8), X9).
(12) mg_p^P(X10) → mg_⋈^P(mg_p^N(X10), X11).

Goal Solving Rules

(13) f(mg_g^N(s(0))) → f(0) ⇐ mg_f^P(mg_g^N(s(0))) ⋈ true.
(14) f(mg_g^N(0)) → f(F) ⇐ mg_f^P(mg_g^N(0)) ⋈ true.
(15) f(mg_g^N(s(s(Z)))) → f(F) ⇐ mg_f^P(mg_g^N(s(s(Z)))) ⋈ true.

Goal

:- f(mg_g^N(X)) ⋈̸ s(0).

As can be seen, we have the answers X = 0, X = s(0) and X = s(s(Z)) for the proposed goal. In this example, the failure of reduction in g(X), which is nested by f, is due to reasons (c), failure in the condition of g (p(0) ⋈ 0), and (d), failure in the parameter passing (g(0) and g(s(s(Z))) w.r.t. the head g(s(0)) of rule (2)). Case (c) is handled by the magic rules for the outermost functions, since the condition of the rules for the inner functions appears as the condition of these magic rules (see rule (8)). This failure generates mg_⋈̸^P(mg_f^N(0), s(0)) ◁ {F} by applying rule (8), which allows the answer X = s(0) to be obtained by applying rules (11), (1) and (13), respectively. Case (d) is handled by introducing rules representing the failure of parameter passing of the nested functions. For instance, rules (9) and (10) are introduced by rule (2) in order to obtain f(mg_g^N(0)) ◁ {s(F)} and f(mg_g^N(s(s(Z)))) ◁ {s(F)}, by using also (11) and (1) and, finally, (14) and (15).

Note: \overline{t} denotes the set {t' ∈ CTerm_{⊥,F} | t and t' have a DC ∪ {F}-clash}.

Next, we present an algorithm Magic_Alg for this transformation. It uses an auxiliary algorithm Nesting in order to generate nesting magic and goal solving rules. They are shown in Tables 2 and 3, wherein

- t|_i represents the subterm of t at position i; e[e']_i represents e with the subexpression at position i replaced by e'; safe(φ) represents the subexpressions of φ out of the scope of a function.

- e^N is defined as X^N =def X, c(e)^N =def c(e^N) and f(e)^N =def f(e^{mgN}), where X^{mgN} =def X, c(e)^{mgN} =def c(e^{mgN}) and f(e)^{mgN} =def mg_f^N(e^{mgN}); and e^P is defined as X^P =def X, c(e)^P =def c(e^P) and f(e)^P =def …

- The functions f_ind are auxiliary ones added whenever f has nested constructors in the patterns of its rules.

- e represents the sequence of expressions to be considered; h(t) is the head of a program rule; f is a function, representing that e is in the scope of f; the boolean Nested? is true whenever the parameter f has been input; Mg represents the computed set of magic rules; Pg represents the computed set of program rules; ind indicates the position of e in the scope of f; G represents a set of triples (f, g, i) whose meaning is that the function g is nested by f at position i; pos_op indicates the position of e in a constraint, which can be either the left (e ◇ e') or the right (e' ◇ e) hand side (it takes one of two values, 1 or 2); and op represents the operator being considered.

The algorithm is applied as follows, where − denotes arguments not needed for the call:

Mg := ∅; Pg := 𝒫; G := ∅; C := ∅;
for every e ◇ e' ∈ 𝒢 do
    if ((◇ = ⋈̸) or (◇ = <≯)) then Mg := Mg ∪ {◇(e, e')^P → true};
    else Mg := Mg ∪ {◇(e, e')^P → true ⇐ C^N};
    endif
    Magic_Alg(e, −, −, false, Mg, Pg, ind, G, 1, ◇, ((◇ = ⋈̸) or (◇ = <≯)));
    Magic_Alg(e', −, −, false, Mg, Pg, ind, G, 2, ◇, ((◇ = ⋈̸) or (◇ = <≯)));
    C := C ∪ {e ◇ e'}
endfor

Once the algorithm has been applied, 𝒫^MG consists of Pg ∪ Mg, where every program rule f(t) → r ⇐ C is filtered into the form f(t) → r ⇐ mg_f^P(t) ⋈ true, C^N, and its signature consists of DC ∪ DC^MG and FS ∪ FS^MG, where DC^MG and FS^MG denote the sets of nesting magic constructors and passing magic functions, respectively. 𝒢^MG consists of the set of constraints e^N ◇ e'^N, where e ◇ e' ∈ 𝒢.

5 Soundness, Completeness and Optimality Results

Our first result about our evaluation method establishes the equivalence between the original and the transformed program w.r.t. the given goal.

Theorem 2 (Soundness and Completeness).

𝒫 ⊢_CRWLF 𝒢θ ⇔ 𝒫^MG ⊢_CRWLF 𝒢^MG θ.

The second result ensures the optimality of our bottom-up evaluation method, in the sense that every computed SAS corresponds either to a subproof of some solution for the goal or to a function call from the goal.

We denote by neg(𝒫, 𝒢) the subset of outer(𝒫, 𝒢) containing the functions occurring in negative constraints.

Theorem 3 (Optimality).

- If 𝒫^MG ⊢_CRWLF f(e)^N ◁ C, C ≠ {⊥}, then f ∈ outer(𝒫, 𝒢) and either:

  • there exist θ and a proof 𝒫 ⊢_CRWLF 𝒢θ of minimal size and a subproof of the form 𝒫 ⊢_CRWLF f(e) ◁ C, or

  • there exist 𝒫^MG ⊢_CRWLF g(e')^N ◁ C', a program rule instance g(t) → r ⇐ C, e ◇ e', ... ∈ [R]_{⊥,F}, and proofs 𝒫 ⊢_CRWLF e'_i ◁ C_i where t_i ∈ C_i, and either a proof of minimal size 𝒫 ⊢_CRWLF e ◁ C_e or a proof of minimal size 𝒫 ⊢_CRWLF e' ◁ C_{e'}, containing a subproof 𝒫 ⊢_CRWLF f(e) ◁ C, and if g ∉ neg(𝒫, 𝒢) then there exists a proof 𝒫 ⊢_CRWLF C.

- If 𝒫^MG ⊢_CRWLF f_k(e)^N ◁ C, then there exists a proof 𝒫^MG ⊢_CRWLF f(e') ◁ C containing subproofs 𝒫^MG ⊢_CRWLF e'_i ◁ C_i for some C_i.

- If 𝒫^MG ⊢_CRWLF ◇(e, e')^P ◁ C_0, then there exist proofs 𝒫^MG ⊢_CRWLF e^N ◁ C for some C and 𝒫^MG ⊢_CRWLF e'^N ◁ C' for some C'.

- If 𝒫^MG ⊢_CRWLF f(e)^P ◁ C_0, then there exists a proof 𝒫^MG ⊢_CRWLF f(e)^N ◁ C for some C.



6 Conclusions and Future Work 

In this paper, we have presented a goal-directed bottom-up evaluation for functional-logic programs with negative information. By adopting the CRWLF semantics [14] for our language, we have defined Herbrand models for this semantics and a fix-point operator which computes the Herbrand model of CRWLF programs. Moreover, we have modified the magic transformation presented in [1] in order to handle negative constraints, avoiding the known problems related to magic transformations. As future work, we plan to incorporate grouping and aggregation operators into our language and to investigate an extension of the relational algebra for it. We are also starting the implementation of this language, which will be based on traditional indexing techniques.

Table 2. Magic Algorithm



Magic_Alg(in e : tuple(Expression); in h(t) : Expression; in f : FunctionSymbol;
in Nested? : Bool; in/out Mg : Program; in/out Pg : Program; in/out ind : Index;
in/out G : set(tuple(FunctionSymbol, FunctionSymbol, Index)); in pos_op : Index;
in op : Operator; in Failure? : Bool)
var C' : Constraint;
if Nested? then 
ind := 0; 

for every ej, . . . , e^ do 
case Gi of 

X, X G V : 
if (t|j = X) then 
for every ((h, k, j) G G) do 
G GU {(find,k,i)}; 

Nesting (k, f , Mg, Pg, ind, i, G, pos_op, op. Failure?) ; 
endfor; 
endif ; 

c(e'), c G : 

for every find(t) ^ r C G Pg do 

if (ti = c(lT') and not (lT' = X and X Pi (safe(r) U safe(C)) — 0)) then 
Pg := Pg U {find+i(t[t']i) ^_r_^ c", fiodCXlclV)]!)" -> f i„d+i (X[V]i)"} ; 

Mg := Mg U {op(Yi,Y 2 [fi„d+i(X[V]i)]p„,_„p)'’ ^ op(Yi, Y 2 [fidd(X[c(V)]i)]p„,_„p)'’}; 

Mg := Mg U {fi„d+l(Y)’’ ^ op(Zi,Z2[fi„d+l(Y)]p„s_op)'’}; 

endif ; 

if (ti G V and ti G safe(r) U safe(C)) then 

Pg := Pg U {fi„d+i(t) -*■ r 4= C", fi„d(X[c(V)]i)" -> fi„d+i(X[V]i_)"}; 

Mg := Mg U {op(Yi,Y_ 2 [fi„d+i(X[V]i)]p„,_„p)'’ ^ op(Yi, Y 2 [fidd(X[c(V)]i)]p„,_„p)'’}; 

Mg := Mg U {fi„d+l(Y)’’ -> op(Zi,Z 2 [fi„d+i(Y)]p„s_„p)'’}; 
endif; 
endfor; 
ind := ind -f 1; 

Magic_Alg(e', h(t), f , true. Mg, Pg, ind, G, pos_op, op, Failure?); 
k(e"^), k G FS : 
if ((find,k, i) ^ G) then 

G G U {(findi k, i)}; Nesting(k, f , Mg, Pg, ind, i, G, pos.op, op. Failure?); 
]VIagic_Alg(mg_k^(G'), h(t), f , true, Mg, Pg, ind, G, pos.op, op. Failure?); 
endif ; 
endcase; 
endfor; 
else 

for every gi, . . . , Gn do 
case Gi of 

c(e"0, c G DC” : 

Mg :=MgU {op_(Xj,Y,)'' ^ op(c(X), c(Y))L 1 < j < m}; 

Magic_Alg(e', h(t) ,—, false, Mg, Pg, ind, G, pos.op, op, Failure?) ; 
k(e"0, k G FS : 

Mg := Mg U {k(_Y)” ^ op(Xi,X 2 [k(Y)]p„,_„p)'“}; 

Magic_Alg(e', h(t), k, true. Mg, Pg, ind, G, pos.op, op. Failure?); 
for every k(s) ^ r C G Pg do 
Mg := Mg U {op(Xi,_X 2 [r]p„._„p)'’ -> op(Xi , X 2 [k(s)]p<,,_„p)‘‘ ^ c"}; 

]VIagic_Alg(r, k(s), — , false. Mg, Pg, ind, G, pos_op, op. Failure?); 

C' 0; 

for every e^e^ G C do 

if Failure? then Mg := Mg U {0(®: k(s)^}; 

else Mg Mg U k(s)^ 

endif 

]VIagic_Alg(e, k(s), — , false. Mg, Pg, ind, G, 1, ((<^ =^) or (0 

]V[agic_Alg(e^, k(s), — , false. Mg, Pg, ind, G, 2, ((^ =1?^) or (<^ =<5^>))); 

C' C' U {e<>e'}; 

endfor; 

endfor; 

endcase; 

endfor; 

endif;

Table 3. Nesting Transformation



Nesting(in k : FunctionSymbol; in f : FunctionSymbol; in/out Mg : Program; in/out Pg : Program;
in/out ind : Index; in i : Index; in/out G : set(tuple(FunctionSymbol, FunctionSymbol, Index));
in pos_op : Index; in op : Operator; in Failure? : Bool)
var C'' : Constraint;
for every find(t) ^ r C G Pg do 
if (ti ^ V) or ((ti G V) and (ti G safe(r) U safe(C))) then 
for every k(s) ^ r^ G Pg do 

Mg := Mg U {op(Yi,Y 2 [fi„d(X[r']i)]p„,_„p)‘‘ -> op(Yi, Y 2 [fidd(_X[k(s)]i)]p„=_„p)'’ ^ C'}; 

Mg :=MgU{op(Yi,Y 2 [fidd(X[F]i)] 

pos_op f -> op(Yi,Y 2 [fidd(X[k(i)]i)]p„,_„p)'’ ^ c'}; 

Pg := Pg U {find(X[k(s)]i)" ^ find(X[r']i)" ^ C'}; 

Pg := Pg u {fi„d(x[k(i)]i)" ^ fiddixlFli)" ^ c'}; 

]VIagic_Alg(r^ , k(s), true, f , Mg, Pg, ind, G, pos.op, op, Failure?); 

C" := 0 ; 

for every e^e^ G C' do 

if Failure? then Mg := Mg U {<0>(find(X[e]i), fi„d(X[e']i))'’ -> fi„d(X[k(s)]i)'’}; 

else Mg := Mg U {0(fi„d(X[e]i), fi„d(X[e']i))‘‘ -> f i„d(X[k(s)]i)'” ^ c""}; 

endif ; 

Magic_Alg(e, f ind(X[k(s)]i), — , false, Mg, Pg, ind, G, 1, Ot ((0 or (0 =<t>)))\ 

Magic_Alg(e^, find(X[k(s)]i), — , false. Mg, Pg, ind, G, 2, -O', ((‘O' =1?^) or (O =</>))); 

C" := Z” U {e<C>e'}; 

endfor; 

endif; 

endfor; 

endif; 

endfor;

Abstract. In this paper we present a logic programming based framework for the integration of possibly inconsistent databases. In particular we consider the problem of ‘merging’ databases and, since the resulting ‘merged’ database may be inconsistent (with respect to the constraints defined on the input databases or with respect to additional constraints), we address the problem of managing inconsistent databases. We propose a general logic framework for computing repairs and consistent answers over inconsistent databases. The logic framework and the techniques for computing repairs and consistent answers proposed here are more general than previously proposed techniques. Indeed, our technique is sound and complete for universally quantified constraints, whereas previously defined techniques only consider restricted cases.



1 Introduction 

The aim of data integration is to provide a uniform integrated access to multiple 
heterogeneous information sources, which were designed independently for au- 
tonomous applications and whose contents are strictly related. The integration 
of knowledge from multiple sources is an important aspect in several areas such 
as data warehousing, database integration, automated reasoning systems, active 
reactive databases and others. However, the database obtained from the merg- 
ing of different sources could contain inconsistent data. The following example 
shows a typical case of inconsistency. 

Example 1. Consider the database consisting of the single binary relation Teaches(Course, Professor), where the attribute Course is a key for the relation. Assume there are two different instances for the relation Teaches: D1 = {Teaches(c1,p1), Teaches(c2,p2)} and D2 = {Teaches(c1,p1), Teaches(c2,p3)}. The two instances satisfy the constraint that Course is a key but, from their union, we derive a relation which does not satisfy the constraint, since there are two distinct tuples with the same value for the attribute Course. □

In the integration of two conflicting databases, simple solutions could be based on the definition of preference criteria such as a partial order on the source information or majority criteria [18]. However, these solutions are not generally satisfactory, and more useful solutions are those based on 1) the computation of ‘repairs’ for the database, 2) the computation of consistent answers [2].

The computation of repairs is based on the insertion and deletion of tuples so that the resulting database satisfies all constraints, whereas the computation of consistent answers is based on the identification of tuples satisfying integrity constraints and on the selection of tuples matching the goal. For instance, for the integrated database of Example 1, we have two alternative repairs, each consisting in the deletion of one of the tuples (c2,p2) and (c2,p3). The consistent answer to a query over the relation Teaches contains the unique tuple (c1,p1), so that we don’t know exactly which professor teaches course c2.
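The repairs and the consistent answer of Example 1 can be reproduced with a small sketch. It is deliberately restricted to a single key constraint and to deletion-only repairs, which is an assumption made for illustration and much narrower than the general framework of the paper; the function names are hypothetical.

```python
from collections import defaultdict
from itertools import product


def key_violations(tuples, key_index=0):
    """Group tuples by key value and keep the groups with more than one tuple."""
    groups = defaultdict(set)
    for t in tuples:
        groups[t[key_index]].add(t)
    return {k: v for k, v in groups.items() if len(v) > 1}


def key_repairs(tuples, key_index=0):
    """Deletion-only repairs: keep exactly one tuple per key value."""
    groups = defaultdict(list)
    for t in sorted(tuples):
        groups[t[key_index]].append(t)
    return [set(choice) for choice in product(*groups.values())]


def consistent_answer(tuples, key_index=0):
    """Tuples that belong to every repair."""
    return set.intersection(*key_repairs(tuples, key_index))


D1 = {("c1", "p1"), ("c2", "p2")}
D2 = {("c1", "p1"), ("c2", "p3")}
merged = D1 | D2

print(key_violations(merged))     # the conflicting tuples for course c2
print(key_repairs(merged))        # two alternative repairs
print(consistent_answer(merged))  # only the tuple for course c1 survives
```

Each instance alone has no violation, while their union has one conflicting key value and therefore two repairs.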

In this paper, we focus our attention on the integration of conflicting instances 
[1,2,7] related to the same concept and possibly coming from different sources. 
We introduce two operators, called Merge operator and Prioritized Merge oper- 
ator, which allow us to combine data coming from different sources. Moreover, 
since the resulting ‘merged’ database may be inconsistent (with respect to the 
constraints defined on the input databases or with respect to additional con- 
straints), we address the problem of managing inconsistent databases. 

We propose a general logic framework for computing repairs and consistent 
answers over inconsistent databases. Our technique is based on the rewriting of 
the different types of constraints into (prioritized) extended disjunctive rules with 
two different forms of negation: negation as failure and classical negation. The 
disjunctive program can be used both to generate ‘repairs’ for the database, i.e. 
minimal sets of insert and delete operations which make the database consistent, 
and to produce consistent answers, i.e. maximal sets of atoms which do not 
violate the constraints. Our technique is more general than techniques previously 
proposed, and it is sound and complete as each preferred stable model defines a 
repair and each repair is derived from a preferred stable model. 

Recently there have been several proposals considering the problem of man- 
aging inconsistent databases. The problem has been deeply investigated mainly 
in the areas of databases and artificial intelligence [1,2,4,7,18,19,21]. A technique 
based on the rewriting of (single head) integrity constraints into logic rules has 
been proposed in [12], whereas [3] proposed a technique for the rewriting of 
binary constraints into logic (non disjunctive) rules. 



2 Background 

We assume familiarity with disjunctive logic programs and disjunctive deductive databases [8,9] and recall here the non-standard definitions used in the paper.

Extended disjunctive databases. Extended Datalog programs extend standard Datalog programs with a different form of negation, known as classical or strong negation, which can also appear in the head of rules [9,11,15]. An extended atom is either an atom, say A, or its negation ¬A. An extended Datalog program is a set of rules of the form

A0 ∨ ... ∨ Ak ← B1, ..., Bm, not Bm+1, ..., not Bn        k + n > 0

where A0, ..., Ak, B1, ..., Bn are extended atoms. A 2-valued interpretation I for an extended program 𝒫 is a pair (T, F) where T and F define a partition of B_𝒫 ∪ ¬B_𝒫 and ¬B_𝒫 = {¬A | A ∈ B_𝒫} (B_𝒫 being the Herbrand base of 𝒫). An interpretation I = (T, F) is consistent if there is no atom A such that A ∈ T and ¬A ∈ T. The semantics of an extended program 𝒫 is defined by considering each negated predicate symbol, say ¬p, as a new symbol syntactically different from p and by adding to the program, for each predicate symbol p with arity n, the constraint ← p(X1, ..., Xn), ¬p(X1, ..., Xn). In the following, for the sake of simplicity, we shall also use rules whose bodies may contain disjunctions.

Queries. Predicate symbols are partitioned into two distinct sets: base predicates and derived predicates. Base predicates correspond to database relations and they do not appear in the head of any rule, whereas derived predicates are defined by means of rules. Given a database D, a predicate symbol r and a program 𝒫, D(r) denotes the set of r-tuples in D, whereas 𝒫_D denotes the program derived from the union of 𝒫 with the tuples in D, i.e. 𝒫_D = 𝒫 ∪ {r(t) ← | t ∈ D(r)}. In the following a tuple t of a relation r will also be denoted as a fact r(t). The semantics of 𝒫_D is given by the set of its stable models by considering either their union (possible semantics or brave reasoning) or their intersection (certain semantics or cautious reasoning). A query Q is a pair (g, 𝒫) where g is a predicate symbol, called the query goal, and 𝒫 is a program. The application of a query Q to a database D will be denoted by Q(D). The answer to a query Q = (g, 𝒫) over a database D, under the possible (resp. certain) semantics, is given by D'(g), where D' = ⋃_{M ∈ SM(𝒫_D)} M (resp. D' = ⋂_{M ∈ SM(𝒫_D)} M).
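The difference between the two semantics can be illustrated as follows. The sketch assumes the stable models are already given as Python sets of facts (computing them is outside its scope), and it returns the empty set when no stable model exists, which is a simplification; the function name `answer` is hypothetical.

```python
def answer(stable_models, goal, mode="certain"):
    """Answer to goal g: tuples of g in the union (possible) or intersection (certain)."""
    if not stable_models:
        return set()  # simplification for the case without stable models
    if mode == "possible":
        combined = set.union(*stable_models)
    else:
        combined = set.intersection(*stable_models)
    return {args for (pred, args) in combined if pred == goal}


# Two hand-written stable models, one per choice of professor for course c2.
m1 = {("teaches", ("c1", "p1")), ("teaches", ("c2", "p2"))}
m2 = {("teaches", ("c1", "p1")), ("teaches", ("c2", "p3"))}

print(answer([m1, m2], "teaches", "possible"))  # three tuples
print(answer([m1, m2], "teaches", "certain"))   # only (c1, p1)
```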

Integrity constraints. Database schemata contain the knowledge on the struc- 
ture of data, i.e. they define constraints on the form the data must have. Integrity 
constraints express semantic information on data, that is relationships that must 
hold among data in the theory and they are mainly used to validate database 
transactions. They are usually defined by first order rules or by means of special 
notations for particular classes such as keys and functional dependencies. 

Definition 1. An integrity constraint (or embedded dependency) is a formula of the first order predicate calculus of the form: (∀X) [ Φ(X) ⊃ (∃Z) Ψ(Y) ] where Φ and Ψ are two conjunctions of literals, X and Y are the distinct sets of variables appearing in Φ and Ψ respectively, and Z = Y − X is the set of variables existentially quantified. □

In the above definition the conjunction Φ is called the body and the conjunction Ψ the head of the integrity constraint. Most of the dependencies developed in database theory are restricted cases of some of the above classes. For instance, functional dependencies are positive, full, single-head, unirelational, equality-generating constraints [14].

In the rest of the paper we concentrate on full (or universal) single-head constraints, where Ψ is a literal or a conjunction of built-in literals (i.e. comparison operators). Therefore, an integrity constraint is a formula of the form: (∀X) [ B1 ∧ ... ∧ Bn ∧ not A1 ∧ ... ∧ not Am ∧ φ ⊃ A0 ] where A1, ..., Am, B1, ..., Bn are base positive literals, φ is a conjunction of built-in literals, A0 is a base positive atom or a conjunction of built-in atoms, and X denotes the list of all variables appearing in B1, ..., Bn; moreover, the variables appearing in A0, ..., Am and φ also appear in B1, ..., Bn. Often we shall write our constraints in a different format by moving literals from the head to the body and vice versa. For instance, the above constraint could be rewritten as (∀X) [ B1 ∧ ... ∧ Bn ∧ φ ⊃ A0 ∨ A1 ∨ ... ∨ Am ] or in the form of a rule with empty head, called a denial: (∀X) [ B1 ∧ ... ∧ Bn ∧ not A0 ∧ not A1 ∧ ... ∧ not Am ∧ φ ⊃ ], which is satisfied only if the body is false.

3 Database Integration 

Integrating data from different sources consists of two main steps: the first in 
which the various relations are merged together and the second in which some 
tuples are removed (or inserted) from the resulting database in order to ensure 
some integrity constraints. Before formally introducing the database integration 
problem let us recall some basic definitions and notations. 

Let R be a relation name; then we denote by: i) attr(R) the set of attributes of R, ii) key(R) the set of attributes in the primary key of R, iii) fd(R) the set of functional dependencies of R, and iv) inst(R) the instance of R (set of tuples). Given a tuple t ∈ inst(R), key(t) denotes the values of the key attributes of t whereas, for a given database D, fd(D) denotes the set of functional dependencies of D and inst(D) denotes the database instance.

We assume that relations associated with the same concept have been ho- 
mogenized with respect to a common ontology, so that attributes denoting the 
same concepts have the same name [23] . We say that two homogenized relations 
R and S, associated with the same concept, are overlapping if key{R) = key{S). 

The database integration problem is as follows: given two databases
$D_1 = \{R_1, \ldots, R_k\}$ and $D_2 = \{S_1, \ldots, S_k\}$, where $R_i$ and $S_i$ refer to
the same concept, and a set of integrity constraints $\mathcal{IC}$, compute a database
$D = \{T_1, \ldots, T_k\}$, where each $T_i$, derived from $R_i$ and $S_i$, is such that
$inst(D) \models \mathcal{IC}$. It is important to note that $\mathcal{IC}$ denotes a set of
constraints which must be satisfied by the integrated database and it may be
different from, and more general than, the constraints defined on the input
databases.

Since the integrated database $D$ does not generally satisfy all integrity
constraints in $\mathcal{IC}$, we first compute a database $D'$ by 'merging' the input
databases and then compute repairs which make the database consistent.
Alternatively, we leave the integrated database inconsistent and compute
answers by selecting only tuples satisfying the constraints. Therefore, in the
integration of databases we deal with three different problems: 1) merging of
the input databases, 2) computing repairs for the possibly inconsistent merged
database, and 3) computing consistent answers to queries. These problems will
next be addressed.

Database merging. The first step in the integration of two databases
$D_1 = \{R_1, \ldots, R_k\}$ and $D_2 = \{S_1, \ldots, S_k\}$ is the merging of two different
data sources; this can be done by means of a merging operator applied to $D_1$
and $D_2$. The resulting database $D = \{T_1, \ldots, T_k\}$ is obtained by merging
relations associated with the same concept, that is $T_i = R_i \diamond S_i$ for
$1 \le i \le k$.

Definition 2. Given two, not necessarily distinct, relations $R$ and $S$ such that
$attr(R) \subseteq attr(S)$ and two tuples $t_1 \in inst(R)$ and $t_2 \in inst(S)$, we say that
$t_1$ is less informative than $t_2$ ($t_1 \ll t_2$) if for each attribute $a$ in $attr(R)$,
$t_1[a] = t_2[a]$ or $t_1[a] = \bot$, where $\bot$ denotes the null value. Moreover, given
two relations $R$ and $S$, we say that $R \ll S$ if $\forall t_1 \in inst(R)\ \exists t_2 \in inst(S)$
s.t. $t_1 \ll t_2$. □
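
The ordering of Definition 2 can be illustrated with a small Python sketch. The representation is an assumption of the sketch, not part of the paper: tuples are dictionaries, `None` stands in for the null value, and the function names are hypothetical.

```python
NULL = None  # stand-in for the null value (assumption of this sketch)

def less_informative(t1, t2, attrs):
    """t1 << t2: on every attribute, t1 agrees with t2 or is null."""
    return all(t1[a] == t2[a] or t1[a] is NULL for a in attrs)

def relation_less_informative(r, s, attrs):
    """R << S: every tuple of R is less informative than some tuple of S."""
    return all(any(less_informative(t1, t2, attrs) for t2 in s) for t1 in r)

ATTRS = ("K", "Title", "Year")
r = [{"K": 3, "Title": "Sky", "Year": NULL}]
s = [{"K": 3, "Title": "Sky", "Year": 1965}]

print(relation_less_informative(r, s, ATTRS))  # is r << s ?
print(relation_less_informative(s, r, ATTRS))  # is s << r ?
```

The check is asymmetric: a null can be matched by any value, but a known value is matched only by itself.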

Definition 3. Let $R$ and $S$ be two relations, a binary operator $\diamond$ such that:

1. $attr(R \diamond S) = attr(R) \cup attr(S)$,

2. $R \bowtie S \ll R \diamond S$,

is called merge operator. Moreover, it is said to be

- lossless, if for all $inst(R)$ and $inst(S)$, $R \ll (R \diamond S)$ and $S \ll (R \diamond S)$;

- dependency preserving, if for all $inst(R)$ and $inst(S)$,
  $(R \diamond S) \models (fd(R) \cap fd(S))$. □

Observe that if a merge operator is dependency preserving, the resulting relation 
is consistent with respect to the functional dependencies defined on the input 
relations; this happens because the key constraints belong to $fd(R) \cap fd(S)$. For
this reason, in the following, we only consider lossless operators and introduce 
preferences (i.e. a partial order) according to the functional dependencies of the 
source databases which are preserved in the merged database. 

Definition 4. Given two lossless merge operators $\diamond_1$ and $\diamond_2$, we say that
i) $\diamond_1$ is content preferable to $\diamond_2$ ($\diamond_1 \prec_{C} \diamond_2$) if, for all $R$ and
$S$, $|R \diamond_1 S| < |R \diamond_2 S|$, and ii) $\diamond_1$ is dependency preferable to $\diamond_2$
($\diamond_1 \prec_{FD} \diamond_2$) if, for all $R$ and $S$, the number of tuples in $(R \diamond_1 S)$
which violate $fd(R) \cap fd(S)$ is less than the number of tuples in $(R \diamond_2 S)$
which violate $fd(R) \cap fd(S)$. □

Repairing inconsistent databases. Once the logical conflicts due to the 
schema heterogeneity have been resolved, conflicts may arise, during the in- 
tegration process, among instances provided by different sources. In particular, 
the same real-world object may correspond to many tuples (possibly residing 
in different overlapping relations), that may have the same value for the key 
attributes but different values for some non-key attribute. 

Let us first introduce the formal definition of consistent database and repairs. 

Definition 5. Given a database $D$ and a set of integrity constraints $\mathcal{IC}$ on $D$,
we say that $D$ is consistent if $D \models \mathcal{IC}$, i.e. if all integrity constraints in
$\mathcal{IC}$ are satisfied by $D$, otherwise it is inconsistent. □

Definition 6. Given a (possibly inconsistent) database $D$, a repair for $D$ is a
pair of sets of atoms $(R^+, R^-)$ such that 1) $R^+ \cap R^- = \emptyset$,
2) $D \cup R^+ - R^- \models \mathcal{IC}$ and 3) there is no pair $(S^+, S^-) \neq (R^+, R^-)$ such
that $S^+ \subseteq R^+$, $S^- \subseteq R^-$ and $D \cup S^+ - S^- \models \mathcal{IC}$. The database
$D \cup R^+ - R^-$ will be called the repaired database. □

Thus, repaired databases are consistent databases which are derived from the
source database by means of a minimal set of insertions and deletions of tuples.
Given a repair $R$, $R^+$ denotes the set of tuples which will be added to the
database, whereas $R^-$ denotes the set of tuples of $D$ which will be deleted. In
the following, for a given repair $R$ and a database $D$, $R(D) = D \cup R^+ - R^-$
denotes the application of $R$ to $D$.

Example 2. Assume we are given a database $D = \{p(a), p(b), q(a), q(c)\}$ with the
inclusion dependency $(\forall X)\,[\, p(X) \supset q(X) \,]$. $D$ is inconsistent since
$p(b) \supset q(b)$ is not satisfied. The repairs for $D$ are $R_1 = (\{q(b)\}, \emptyset)$ and
$R_2 = (\emptyset, \{p(b)\})$ producing, respectively, the repaired databases
$R_1(D) = \{p(a), p(b), q(a), q(c), q(b)\}$ and $R_2(D) = \{p(a), q(a), q(c)\}$. □
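
The repairs of Example 2 can be checked by brute force. The following Python sketch enumerates every pair of insertions and deletions over the constants a, b, c, keeps the pairs that yield a consistent database, and then keeps only the minimal ones, as Definition 6 requires. The encoding of atoms as `(predicate, constant)` pairs and all function names are assumptions of the sketch; this exhaustive search is for illustration only and is not the technique proposed in the paper.

```python
from itertools import combinations

D = {("p", "a"), ("p", "b"), ("q", "a"), ("q", "c")}
CONSTANTS = ("a", "b", "c")

def consistent(db):
    """Inclusion dependency: every p(X) in db requires q(X) in db."""
    return all(("q", x) in db for (pred, x) in db if pred == "p")

def subsets(items):
    """All subsets of items, as frozensets."""
    items = sorted(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def repairs(db):
    universe = {(pred, c) for pred in ("p", "q") for c in CONSTANTS}
    # Insertions come from atoms outside db and deletions from db itself,
    # so the two sets are disjoint by construction.
    candidates = [(ins, dele)
                  for ins in subsets(universe - db)
                  for dele in subsets(db)
                  if consistent((db | ins) - dele)]
    # Keep only the minimal pairs (condition 3 of the definition).
    return [(i, d) for (i, d) in candidates
            if not any((i2, d2) != (i, d) and i2 <= i and d2 <= d
                       for (i2, d2) in candidates)]

for inserted, deleted in repairs(D):
    print(sorted(inserted), sorted(deleted))
```

The search is exponential in the number of candidate atoms, which is one motivation for the logic programming rewriting described later.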

Querying inconsistent databases. A (relational) query over a database de- 
fines a function from the database to a relation. It can be expressed by means of 
alternative equivalent languages such as relational algebra, ‘safe’ relational cal- 
culus or ‘safe’ non recursive Datalog [22]. In the following we shall use Datalog. 
Thus, a query is a pair $(g, \mathcal{P})$ where $\mathcal{P}$ is a safe non-recursive Datalog program
and $g$ is a predicate symbol specifying the output (derived) relation. Observe
that relational queries define a restricted case of disjunctive queries. The reason 
for considering relational and disjunctive queries is that, as we shall show next, 
relational queries over databases with constraints can be rewritten into extended 
disjunctive queries over databases without constraints. 

Definition 7. Given a database $D$ and a set of integrity constraints $\mathcal{IC}$, an atom
$A$ is true (resp. false) with respect to $\mathcal{IC}$ if $A$ belongs to all repaired databases
(resp. there is no repaired database containing $A$). The set of atoms which are
neither true nor false is undefined. □

Thus, true atoms appear in all repaired databases whereas undefined atoms
appear in a proper subset of repaired databases. Given a database $D$ and a set of
integrity constraints $\mathcal{IC}$, the application of $\mathcal{IC}$ to $D$, denoted by $\mathcal{IC}(D)$,
defines three distinct sets of atoms: the set of true atoms $\mathcal{IC}(D)^+$, the set of
undefined atoms $\mathcal{IC}(D)^u$ and the set of false atoms $\mathcal{IC}(D)^-$.

Definition 8. Given a database $D$ and a query $Q = (g, \mathcal{P})$, the consistent
answer of the query $Q$ on the database $D$, denoted as $Q(D, \mathcal{IC})$, gives three sets,
denoted as $Q(D, \mathcal{IC})^+$, $Q(D, \mathcal{IC})^-$ and $Q(D, \mathcal{IC})^u$. These contain, respectively,
the sets of $g$-tuples which are true (i.e. belonging to $Q(D')$ for all repaired
databases $D'$), false (i.e. not belonging to $Q(D')$ for all repaired databases $D'$)
and undefined (i.e. set of tuples which are neither true nor false). □




354 G. Greco, S. Greco, and E. Zumpano 



4 A Logic Programming Approach 

The aim of this section is to describe how the integration process (i.e. database
merging, specifying integrity constraints, repairing and querying inconsistent
databases) can be modeled by means of a logic program. In particular, we show
that every disjunctive program can be thought of as a set of rules for
integrating some sources; moreover, we also show that the merging of databases
can be performed by means of a (stratified) Datalog program and that the
computation of repairs and consistent answers can be carried out by rewriting
constraints into logic rules.



4.1 Merging Databases 



We start by introducing two merging operators and a function which, given two
relations $R$ and $S$, replaces null values appearing in the tuples of $R$ with values
of the related tuples in $S$.

In more detail, given two relations $R$ and $S$ such that $attr(R) \subseteq attr(S)$, the
operator $\Theta$ is defined as follows:

$$\Theta(R, S) = \{\, t \in R \mid \nexists\, t_1 \in S \text{ s.t. } key(t) = key(t_1) \,\} \;\cup\; \Big\{\, t \mid \exists\, t_1 \in R,\ \exists\, t_2 \in S \text{ s.t. } \forall a \in attr(R):\ t[a] = \begin{cases} t_2[a] & \text{if } (a \in attr(S) \wedge t_1[a] = \bot) \\ t_1[a] & \text{otherwise} \end{cases} \,\Big\}$$

The Merge Operator. Given two overlapping relations $R$ and $S$, the merge
operator, denoted by $\boxtimes$, integrates the information provided by $R$ and $S$. Let
$T = R \boxtimes S$, then the schema of $T$ contains the union of the attributes in $R$ and
$S$, and its instance is obtained by completing the information coming from each
input relation with that coming from the other one.

Definition 9. Let $R$ and $S$ be two overlapping relations. The merge operator
is a binary operator defined as follows:

$R \boxtimes S = \Theta(R ⟗ S, S) \cup \Theta(R ⟗ S, R)$ □

$R \boxtimes S$ computes the full outer join ($⟗$) and extends tuples coming from $R$
(resp. $S$) with the values of tuples of $S$ (resp. $R$) having the same key. The
extension of a tuple is carried out by the operator $\Theta$ which replaces null values
appearing in a given tuple of the first relation with values appearing in some
correlated tuple of the second relation. Thus, the merge operator applied to two
relations $R$ and $S$ 'extends' the content of tuples of both $R$ and $S$.

Example 3. Consider the relations $R$ and $S$ with schemata $(K, Title, Author)$
and $(K, Title, Author, Year)$ where $K$ is the key of both relations. Assuming
that the instances of $R$ and $S$ consist of the following facts

R(1, Moon, Greg)       S(3, Flowers, Smith, 1965)
R(2, Money, Jones)     S(4, Sea, Taylor, 1971)
R(3, Sky, Jones)       S(5, Sun, Steven, 1980)

the relation $T = R \boxtimes S$ is as follows:

T(1, Moon, Greg, $\bot$)      T(3, Flowers, Smith, 1965)
T(2, Money, Jones, $\bot$)    T(4, Sea, Taylor, 1971)
T(3, Sky, Jones, 1965)    T(5, Sun, Steven, 1980) □
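
The result of Example 3 can be reproduced with a short Python sketch. It follows the prose description of the merge operator (each tuple is padded to the merged schema and its nulls are filled from tuples of the other relation having the same key) rather than transcribing the outer join and the extension operator literally. The use of `None` for the null value and the names `extend` and `merge` are assumptions of the sketch.

```python
NULL = None  # stand-in for the null value (assumption of this sketch)

R_ATTRS = ("K", "Title", "Author")
S_ATTRS = ("K", "Title", "Author", "Year")
R = [(1, "Moon", "Greg"), (2, "Money", "Jones"), (3, "Sky", "Jones")]
S = [(3, "Flowers", "Smith", 1965), (4, "Sea", "Taylor", 1971),
     (5, "Sun", "Steven", 1980)]

def extend(source, other, schema, key):
    """Pad each source tuple to the schema and fill its nulls from the
    tuples of `other` that share its key."""
    result = set()
    for t in source:
        partners = [u for u in other if u[key] == t[key]]
        # With no partner, the tuple is only padded with nulls.
        for u in partners or [{}]:
            result.add(tuple(
                t[a] if t.get(a, NULL) is not NULL else u.get(a, NULL)
                for a in schema))
    return result

def merge(r, s, r_attrs, s_attrs, key="K"):
    """Symmetric merge: extend R with S and S with R, then take the union."""
    schema = tuple(r_attrs) + tuple(a for a in s_attrs if a not in r_attrs)
    rd = [dict(zip(r_attrs, row)) for row in r]
    sd = [dict(zip(s_attrs, row)) for row in s]
    return extend(rd, sd, schema, key) | extend(sd, rd, schema, key)

for row in sorted(merge(R, S, R_ATTRS, S_ATTRS), key=str):
    print(row)
```

Note that the two tuples with key 3 both survive, so the merged relation violates the key constraint; this is the kind of inconsistency the repair machinery is meant to handle.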

The Prioritized Merge Operator. In order to satisfy preference constraints, we in- 
troduce an asymmetric merge operator, called prioritized merge operator, which 
gives preference to data coming from the left relation when conflicting tuples are 
detected. 

Definition 10. Let $R$ and $S$ be two overlapping relations and let
$S' = S \ltimes (\pi_{key(S)} S - \pi_{key(R)} R)$ be the set of tuples in $S$ not joining with
any tuple in $R$. The prioritized merge operator is defined as follows:

$R \lhd S = \Theta(R ⟕ S, S) \cup (R ⟖ S')$ □

The prioritized merge operator includes all tuples of the left relation and only 
the tuples of the right relation whose key does not identify any tuple in the left 
relation. Moreover, only tuples ‘coming’ from the left relation are extended since 
tuples coming from the right relation, which join tuples coming from the left 
relation, are not included. Thus, when integrating relations conflicting on the 
key attributes, the prioritized merge operator gives preference to the tuples of 
the left side relation and completes them with values taken from the right side 
relation. 

Example 4. Consider the relations $R$ and $S$ of Example 3. The relation
$T = R \lhd S$ is as follows:

T(1, Moon, Greg, $\bot$)      T(4, Sea, Taylor, 1971)
T(2, Money, Jones, $\bot$)    T(5, Sun, Steven, 1980)
T(3, Sky, Jones, 1965)
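
The asymmetric behaviour shown in Example 4 can be sketched in Python as follows. As before, `None` stands in for the null value and the function name is hypothetical; the sketch follows the prose description of the prioritized merge (all left tuples, completed from the right, plus the right tuples whose key is unknown on the left) and is not a literal transcription of the algebraic definition.

```python
NULL = None  # stand-in for the null value (assumption of this sketch)

R_ATTRS = ("K", "Title", "Author")
S_ATTRS = ("K", "Title", "Author", "Year")
R = [(1, "Moon", "Greg"), (2, "Money", "Jones"), (3, "Sky", "Jones")]
S = [(3, "Flowers", "Smith", 1965), (4, "Sea", "Taylor", 1971),
     (5, "Sun", "Steven", 1980)]

def prioritized_merge(r, s, r_attrs, s_attrs, key="K"):
    schema = tuple(r_attrs) + tuple(a for a in s_attrs if a not in r_attrs)
    left = [dict(zip(r_attrs, row)) for row in r]
    right = [dict(zip(s_attrs, row)) for row in s]
    left_keys = {t[key] for t in left}
    result = set()
    # Every left tuple is kept; its nulls are filled from same-key right tuples.
    for t in left:
        partners = [u for u in right if u[key] == t[key]]
        for u in partners or [{}]:
            result.add(tuple(
                t[a] if t.get(a, NULL) is not NULL else u.get(a, NULL)
                for a in schema))
    # A right tuple is kept only if its key identifies no left tuple.
    for u in right:
        if u[key] not in left_keys:
            result.add(tuple(u.get(a, NULL) for a in schema))
    return result

for row in sorted(prioritized_merge(R, S, R_ATTRS, S_ATTRS), key=str):
    print(row)
```

Swapping the arguments gives preference to the other source, which is why the operator is useful when one source is considered more reliable.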

The above two merge operators can be easily defined by means of a (non 
recursive) Datalog program. 

Definition 11. Let $R$ and $S$ be two relations and let $K = key(R) = key(S)$,
$A = attr(R) \cap attr(S)$, $B = attr(R) - attr(S)$ and $C = attr(S) - attr(R)$ be
sets of attributes. The integrated relation $P = R \boxtimes S$ is defined by the following
program

$p(K, A, B, C) \leftarrow r(K, A, B),\ s(K, A, C)$
$p(K, A, B, C) \leftarrow r(K, A', B),\ \mathit{not}\ s(K, A', C),\ s(K, A'', C),\ extend(A', A'', A)$
$p(K, A, B, C) \leftarrow s(K, A', C),\ \mathit{not}\ r(K, A', B'),\ r(K, A'', B),\ extend(A', A'', A)$

where

$extend([\,], [\,], [\,])$
$extend([A|L_1], [B|L_2], [C|L_3]) \leftarrow max(A, B, C),\ extend(L_1, L_2, L_3)$
$max(A, B, A) \leftarrow A \neq \bot$
$max(A, B, B) \leftarrow A = \bot$ □

Also the prioritized merge operation, introduced in Definition 10, can be easily
expressed by means of a logic program.

Definition 12. Let $R$ and $S$ be two relations and let $K = key(R) = key(S)$,
$A = attr(R) \cap attr(S)$, $B = attr(R) - attr(S)$ and $C = attr(S) - attr(R)$ be
sets of attributes. The integrated relation $P = R \lhd S$ is defined by the following
program

$p(K, A, B, C) \leftarrow r(K, A, B),\ s(K, A, C)$
$p(K, A, B, C) \leftarrow r(K, A', B),\ \mathit{not}\ s(K, A', C),\ s(K, A'', C),\ extend(A', A'', A)$
$p(K, A, \bot, C) \leftarrow s(K, A, C),\ \mathit{not}\ r(K, A, B)$ □

Thus, in the following we assume that the merging of two databases $D_1$ and
$D_2$ is done by a stratified Datalog program $\mathcal{MP}$ (merging program) which,
applied to $D_1$ and $D_2$, gives a new database $D$ ($\mathcal{MP}(D_1, D_2) = D$) consisting of
a subset of the facts inferred by means of the merging program ($D$ is a subset
of $\mathcal{SM}(\mathcal{MP}_{D_1 \cup D_2})$). The use of a (stratified) logic program for integrating
databases makes the merging process more flexible with respect to the use of
predefined operators such as the ones introduced here and the others defined in
the literature [23].

Example 5. Consider the two relations $R$ and $S$ denoting oriented graphs. The
following program $\mathcal{P}$ 'merges' the two relations by deleting tuples which can be
inferred, i.e. it computes the relation
$T = R \cup S - \pi_{r.f,\, s.t}(R \bowtie_{r.t = s.f} S)$.

$rs(F, T) \leftarrow r(F, T)$
$rs(F, T) \leftarrow s(F, T)$
$t'(F, T) \leftarrow rs(F, I),\ rs(I, T)$
$t(F, T) \leftarrow rs(F, T),\ \mathit{not}\ t'(F, T)$

The output database consists of the $t$-tuples computed by $\mathcal{P}$, that is, the tuples
computed by applying $\mathcal{P}$ to $R$ and $S$.

The following program merges the two relations $R$ and $S$ by computing their
union and replacing null values with values taken from the closure of the graph:

$c(F, T) \leftarrow rs(F, T),\ T \neq \bot$
$c(F, T) \leftarrow rs(F, \bot),\ tc(F, T)$
$tc(F, T) \leftarrow rs(F, T)$
$tc(F, T) \leftarrow rs(F, I),\ tc(I, T)$ □
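
The first program of Example 5 can be mirrored directly with Python sets: `rs` is the union of the two edge relations, `t_prime` collects the edges obtainable by composing two edges of `rs`, and the result keeps the edges of `rs` that are not in `t_prime`. The function name and the sample graphs are assumptions chosen for illustration.

```python
def merge_graphs(r, s):
    """Union of two edge sets minus the edges derivable by one composition."""
    rs = set(r) | set(s)
    # t'(F, T) <- rs(F, I), rs(I, T)
    t_prime = {(f, t) for (f, i) in rs for (j, t) in rs if i == j}
    # t(F, T) <- rs(F, T), not t'(F, T)
    return rs - t_prime

r = {("a", "b"), ("b", "c")}
s = {("a", "c"), ("c", "d")}
print(sorted(merge_graphs(r, s)))
```

In this sample the edge from a to c is dropped because it can be inferred from the edges a to b and b to c. The negation in the last rule is stratified, since `t_prime` is fully computed before it is used.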

Fact 1. The merging of two databases $D_1$ and $D_2$ can be done in polynomial
time in the size of $D_1$ and $D_2$. □

4.2 Managing Inconsistent Databases 

Since the merged database could be inconsistent, we present a technique which
permits us to compute repairs and consistent answers for possibly inconsistent
databases. The technique is based on the generation of a disjunctive program
$\mathcal{DP}(\mathcal{IC})$ derived from the set of integrity constraints $\mathcal{IC}$. The repairs for the
database can be generated from the stable models of $\mathcal{DP}(\mathcal{IC})$ whereas the
computation of the consistent answers of a query $(g, \mathcal{P})$ can be derived by
considering the stable models of the program $\mathcal{P} \cup \mathcal{DP}(\mathcal{IC})$ over the database $D$.

Definition 13. Let $c$ be a universally quantified constraint of the form

$(\forall X)\,[\, B_1 \wedge \cdots \wedge B_n \wedge \varphi \supset A_1 \vee \cdots \vee A_m \,]$

where $B_1, \ldots, B_n, A_1, \ldots, A_m$ are positive atoms, then $dj(c)$ denotes the rule

$\neg B'_1 \vee \cdots \vee \neg B'_n \vee A'_1 \vee \cdots \vee A'_m \leftarrow (B_1 \vee B'_1), \ldots, (B_n \vee B'_n),\ \varphi,\ (\mathit{not}\ A_1 \vee \neg A'_1), \ldots, (\mathit{not}\ A_m \vee \neg A'_m)$

where $B'_i$ ($A'_i$) denotes the atom derived from $B_i$ ($A_i$) by replacing the predicate
symbol $p$ with the new symbol $p_d$. Let $\mathcal{IC}$ be a set of universally quantified
integrity constraints, then $\mathcal{DP}(\mathcal{IC}) = \{\, dj(c) \mid c \in \mathcal{IC} \,\}$. □

Thus, $\mathcal{DP}(\mathcal{IC})$ denotes the set of generalized disjunctive rules derived from
the rewriting of $\mathcal{IC}$, $\mathcal{DP}(\mathcal{IC})_D$ denotes the program derived from the union of the
rules in $\mathcal{DP}(\mathcal{IC})$ with the facts in $D$, and $\mathcal{SM}(\mathcal{DP}(\mathcal{IC})_D)$ (resp.
$\mathcal{MM}(\mathcal{DP}(\mathcal{IC})_D)$) denotes the set of stable (resp. minimal) models of
$\mathcal{DP}(\mathcal{IC})_D$. Recall that every stable model is consistent, according to the
definition of consistent set given in Section 2, since it cannot contain two atoms
of the form $A$ and $\neg A$.

Example 6. Consider the integrated relation $T$ of Example 3. The functional
dependency $K \rightarrow (Title, Author, Year)$ stating that $K$ is a key for the relation
can be rewritten as a first order formula $r$:

$(\forall X, Y, Z, W, Y', Z', W')\,[\, T(X, Y, Z, W) \wedge T(X, Y', Z', W') \supset Y = Y',\ Z = Z',\ W = W' \,]$

The associated disjunctive rule $dj(r)$ is

$\neg T_d(X, Y, Z, W) \vee \neg T_d(X, Y', Z', W') \leftarrow (T(X, Y, Z, W) \vee T_d(X, Y, Z, W)),$
$\qquad (T(X, Y', Z', W') \vee T_d(X, Y', Z', W')),$
$\qquad \mathit{not}(Y = Y',\ Z = Z',\ W = W')$

Observe that a (generalized) extended disjunctive Datalog program can be
simplified by eliminating from the bodies of rules all literals whose predicate
symbols are derived and do not appear in the head of any rule (these literals
cannot be true). Thus the rule can be simplified as

$\neg T_d(X, Y, Z, W) \vee \neg T_d(X, Y', Z', W') \leftarrow T(X, Y, Z, W),\ T(X, Y', Z', W'),$
$\qquad \mathit{not}(Y = Y',\ Z = Z',\ W = W')$

since the predicate $T_d$ does not appear in the head of any rule and for this
reason cannot be inferred by such a program. The above program has two stable
models $M_1 = D \cup \{\neg T_d(3, Sky, Jones, 1965)\}$ and
$M_2 = D \cup \{\neg T_d(3, Flowers, Smith, 1965)\}$. □



Definition 14. A set of integrity constraints IC is said to be acyclic if, by 
rewriting all constraints in IC as denials, every base atom appears either positive 
or negative in all rules. □ 







4.3 Computing Database Repairs 

Every stable model can be used to define a possible repair for the database
by interpreting new derived atoms (denoted by the subscript "d") as insertions
and deletions of tuples. Thus, if a stable model $M$ contains two atoms $\neg p_d(t)$
(derived atom) and $p(t)$ (base atom) we deduce that the atom $p(t)$ violates some
constraint and, therefore, it must be deleted. Analogously, if $M$ contains the
derived atom $p_d(t)$ and does not contain $p(t)$ (i.e. $p(t)$ is not in the database)
we deduce that the atom $p(t)$ should be inserted in the database.

Definition 15. Given a database $D$ and a set of integrity constraints $\mathcal{IC}$ over
$D$, and letting $M$ be a stable model of $\mathcal{DP}(\mathcal{IC})_D$, then
$\mathcal{R}(M) = (\, \{p(t) \mid p_d(t) \in M \wedge p(t) \notin D\},\ \{p(t) \mid \neg p_d(t) \in M \wedge p(t) \in D\} \,)$. □
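
Definition 15 amounts to a simple projection of a stable model onto its derived literals. The Python sketch below assumes the stable models are already available (it does not compute them) and uses a hypothetical encoding: base atoms are `(predicate, argument)` pairs and derived literals are `Derived` records carrying a `negated` flag.

```python
from collections import namedtuple

# Hypothetical encoding of a derived literal p_d(t) or its negation.
Derived = namedtuple("Derived", ["pred", "arg", "negated"])

def repair_from_model(model, db):
    """Return (insertions, deletions) read off the derived literals of a model."""
    derived = [lit for lit in model if isinstance(lit, Derived)]
    insertions = {(l.pred, l.arg) for l in derived
                  if not l.negated and (l.pred, l.arg) not in db}
    deletions = {(l.pred, l.arg) for l in derived
                 if l.negated and (l.pred, l.arg) in db}
    return insertions, deletions

# Database of Example 2 and two models of the shape reported in Example 7.
D = {("p", "a"), ("p", "b"), ("q", "a"), ("q", "c")}
model_with_insertion = D | {Derived("q", "b", False)}  # contains q_d(b)
model_with_deletion = D | {Derived("p", "b", True)}    # contains not-p_d(b)

print(repair_from_model(model_with_insertion, D))
print(repair_from_model(model_with_deletion, D))
```

The first model yields the insertion of q(b) and the second the deletion of p(b), matching the two repairs of Example 2.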



Theorem 2. Given a database $D$ and a set of integrity constraints $\mathcal{IC}$ on $D$, then

1. (Soundness) for every stable model $M$ of $\mathcal{DP}(\mathcal{IC})_D$, $\mathcal{R}(M)$ is a repair for $D$;

2. (Completeness) for every database repair $S$ for $D$ there exists a stable model
   $M$ for $\mathcal{DP}(\mathcal{IC})_D$ such that $S = \mathcal{R}(M)$. □



Example 7. Consider the database of Example 2. The rewriting of the integrity
constraint $(\forall X)\,[\, p(X) \supset q(X) \,]$ produces the disjunctive rule

$r:\ \neg p_d(X) \vee q_d(X) \leftarrow (p(X) \vee p_d(X)),\ (\mathit{not}\ q(X) \vee \neg q_d(X))$

which can be rewritten in the simpler form

$r':\ \neg p_d(X) \vee q_d(X) \leftarrow p(X),\ \mathit{not}\ q(X)$

since the predicates $p_d$ and $\neg q_d$ do not appear in the head of any rule. The
program $\mathcal{P}_D$, where $\mathcal{P}$ is the program consisting of the disjunctive rule $r'$ and
$D$ is the input database, has two stable models $M_1 = D \cup \{q_d(b)\}$ and
$M_2 = D \cup \{\neg p_d(b)\}$. The derived repairs are $\mathcal{R}(M_1) = (\{q(b)\}, \emptyset)$ and
$\mathcal{R}(M_2) = (\emptyset, \{p(b)\})$ corresponding, respectively, to the insertion of $q(b)$
and to the deletion of $p(b)$. □



4.4 Computing Consistent Answers 

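Let $M$ be a stable model of $\mathcal{DP}(\mathcal{IC})_D$ and $\mathcal{R}(M)$ the associated repair for the
database $D$; then $\mathcal{R}(M, D)$ denotes the repaired database obtained by means of
the deletions and insertions specified in $M$.

The consistent answer for the query $Q = (g, \mathcal{P})$ over the database $D$ under
constraints $\mathcal{IC}$ is as follows:

$Q(D, \mathcal{IC})^+ = \{\, g(t) \mid \forall M \in \mathcal{SM}(\mathcal{DP}(\mathcal{IC})_D):\ g(t) \in \mathcal{MM}((\mathcal{P} \cup \mathcal{DP}(\mathcal{IC}))_{\mathcal{R}(M,D)}) \,\}$
$Q(D, \mathcal{IC})^- = \{\, g(t) \mid \nexists M \in \mathcal{SM}(\mathcal{DP}(\mathcal{IC})_D) \text{ s.t. } g(t) \in \mathcal{MM}((\mathcal{P} \cup \mathcal{DP}(\mathcal{IC}))_{\mathcal{R}(M,D)}) \,\}$
$Q(D, \mathcal{IC})^u = \{\, g(t) \mid \exists M_1, M_2 \in \mathcal{SM}(\mathcal{DP}(\mathcal{IC})_D) \text{ s.t. } g(t) \in \mathcal{MM}((\mathcal{P} \cup \mathcal{DP}(\mathcal{IC}))_{\mathcal{R}(M_1,D)}),\ g(t) \notin \mathcal{MM}((\mathcal{P} \cup \mathcal{DP}(\mathcal{IC}))_{\mathcal{R}(M_2,D)}) \,\}$

For instance, in Example 7, the set of true tuples are those belonging to the
intersection of the two models, that is $p(a)$, $q(a)$ and $q(c)$, whereas the set of
undefined tuples are those belonging to the union of the two models and not
belonging to their intersection, that is $p(b)$ and $q(b)$.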
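
The classification just described can be sketched in Python by working directly on the repaired databases: true atoms lie in the intersection, undefined atoms lie in the union but not in the intersection, and false atoms are the remaining candidates. The function name, the atom encoding, and the choice of candidate atoms (all atoms over the constants a, b, c) are assumptions of the sketch.

```python
def classify(repaired_dbs, candidates):
    """Split candidate atoms into (true, undefined, false) sets."""
    true_atoms = set.intersection(*repaired_dbs)
    possible = set.union(*repaired_dbs)
    undefined = possible - true_atoms
    false_atoms = set(candidates) - possible
    return true_atoms, undefined, false_atoms

# The two repaired databases of Example 2.
repaired_1 = {("p", "a"), ("p", "b"), ("q", "a"), ("q", "c"), ("q", "b")}
repaired_2 = {("p", "a"), ("q", "a"), ("q", "c")}
candidates = {(pred, c) for pred in ("p", "q") for c in ("a", "b", "c")}

true_atoms, undefined, false_atoms = classify([repaired_1, repaired_2], candidates)
print(sorted(true_atoms))
print(sorted(undefined))
print(sorted(false_atoms))
```

With these inputs the only false candidate is p(c), which occurs in no repaired database.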

Note that for every database $D$, query $Q = (g, \mathcal{P})$ and repaired database $D'$:

1. Each atom $A \in Q(D, \mathcal{IC})^+$ belongs to the stable model of $\mathcal{P}_{D'}$ (soundness).

2. Each atom $A \in Q(D, \mathcal{IC})^-$ does not belong to the stable model of $\mathcal{P}_{D'}$
   (completeness).

Example 8. Consider the database of Example 1. The functional dependency
defined by the key of relation $Teaches$ can be defined as

$(\forall X, Y, Z)\,[\, Teaches(X, Y) \wedge Teaches(X, Z) \supset Y = Z \,]$

The corresponding disjunctive program $\mathcal{P}$ consists of the rule

$\neg Teaches_d(X, Y) \vee \neg Teaches_d(X, Z) \leftarrow Teaches(X, Y),\ Teaches(X, Z),\ Y \neq Z$

The program $\mathcal{P}_D$ has two stable models: $M_1 = D \cup \{\neg Teaches_d(c_2, p_2)\}$ and
$M_2 = D \cup \{\neg Teaches_d(c_2, p_3)\}$. The answer to the query $(Teaches, \emptyset)$ gives the
tuple $(c_1, p_1)$ as true and the tuples $(c_2, p_2)$ and $(c_2, p_3)$ as undefined. □

5 Repair Constraints 

In this section we introduce repair constraints which permit us to restrict the 
number of repairs. These constraints can be defined during the integration phase 
to give preference to certain data with respect to others. 

Definition 16. A repair constraint is a denial rule of the form

$\leftarrow up_1(A_1), \ldots, up_k(A_k),\ L_1, \ldots, L_n$

where $up_1, \ldots, up_k \in \{insert, delete\}$, $A_1, \ldots, A_k$ are atoms and $L_1, \ldots, L_n$
are standard literals. □

Informally, the semantics of a repair constraint is as follows: if $L_1, \ldots, L_n$ is
true in the repaired database then at least one of the updates $up_i(A_i)$ must be
false.

Definition 17. Let $D$ be a database, $\mathcal{IC}$ a set of integrity constraints and $\mathcal{RC}$ a
set of repair constraints. We say a repair $R$ for $D$ satisfies $\mathcal{RC}$ (written
$(R, D) \models \mathcal{RC}$) if for each
$\leftarrow insert(A_1), \ldots, insert(A_k),\ delete(B_1), \ldots, delete(B_h),\ L_1, \ldots, L_n$
in $\mathcal{RC}$, i) there is some $A_i$ false in $R^+$ or ii) there is some $B_i$ false in $R^-$ or
iii) there is some $L_i$ false in $R(D)$. □

Given a database $D$, a set of integrity constraints $\mathcal{IC}$ and a set of repair
constraints $\mathcal{RC}$ over $D$, a repair $R$ for the database $D$ is said to be feasible if all
repair constraints in $\mathcal{RC}$ are satisfied. Moreover, a repaired database $R(D)$ is
said to be feasible if $R$ is feasible. Clearly, feasible repaired databases are
consistent.

Example 9. Consider the database $D = \{e(Peter, 30000), e(John, 40000),
e(John, 50000)\}$ containing information about names and salaries of employees,
and the integrity constraint $(\forall X, Y, Z)\,[\, e(X, Y) \wedge e(X, Z) \supset Y = Z \,]$.
There are two repairs for such a database: $R_1 = (\emptyset, \{e(John, 40000)\})$ and
$R_2 = (\emptyset, \{e(John, 50000)\})$ producing, respectively, the repaired databases
$D_1 = \{e(Peter, 30000), e(John, 50000)\}$ and $D_2 = \{e(Peter, 30000), e(John, 40000)\}$.
The repair constraint

$\leftarrow delete(e(X, S)),\ e(X, S'),\ S' > S$

states that if the same employee occurs with more than one salary we cannot
delete the tuple with the lowest salary. Thus, the repair $R_1$ is not feasible since
it deletes the tuple $e(John, 40000)$ but the repaired database $R_1(D)$ contains
the tuple $e(John, 50000)$. □
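
The feasibility test of Example 9 can be sketched in Python. The two repairs are taken as given, and the single repair constraint of the example is hard-coded in `feasible`; the encoding of tuples as `(name, salary)` pairs and the function names are assumptions of the sketch, not a general evaluator for repair constraints.

```python
D = {("Peter", 30000), ("John", 40000), ("John", 50000)}

# The two repairs of the example as (insertions, deletions).
REPAIR_1 = (frozenset(), frozenset({("John", 40000)}))
REPAIR_2 = (frozenset(), frozenset({("John", 50000)}))

def apply_repair(repair, db):
    insertions, deletions = repair
    return (db | insertions) - deletions

def feasible(repair, db):
    """Check the denial  <- delete(e(X,S)), e(X,S'), S' > S.

    The repair is infeasible if it deletes e(X, S) while the repaired
    database still holds a tuple e(X, S') with a higher salary S'."""
    _, deletions = repair
    repaired = apply_repair(repair, db)
    return not any(name == other and higher > salary
                   for (name, salary) in deletions
                   for (other, higher) in repaired)

for repair in (REPAIR_1, REPAIR_2):
    print(sorted(apply_repair(repair, D)), feasible(repair, D))
```

Only the repair that deletes the higher salary passes the check, so the constraint leaves a single feasible repaired database.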

Definition 18. Given a database $D$, a set of integrity constraints $\mathcal{IC}$ and a set
of repair constraints $\mathcal{RC}$ over $D$, an atom $A$ is true (resp. false) with respect
to $(D, \mathcal{IC}, \mathcal{RC})$ if $A$ belongs to all feasible repaired databases (resp. there is no
feasible repaired database containing $A$). The set of atoms which are neither
true nor false is undefined. □

Clearly, for an empty set of repair constraints Definition 7 and Definition 18
coincide. The formal semantics of databases with both integrity and repair
constraints is given by rewriting the repair constraints into (generalized) extended
disjunctive rules with empty heads (denials). In particular, the sets of integrity
constraints $\mathcal{IC}$ and repair constraints $\mathcal{RC}$ are rewritten into an extended
disjunctive program $\mathcal{DP}(\mathcal{IC}, \mathcal{RC})$. Each stable model of $\mathcal{DP}(\mathcal{IC}, \mathcal{RC})$ over a
database $D$ can be used to generate a repair for the database, whereas each stable
model of the program $\mathcal{P} \cup \mathcal{DP}(\mathcal{IC}, \mathcal{RC})$, over the database $D$, can be used to
compute a consistent answer to a query $(g, \mathcal{P})$. Each model defines a set of actions
(update operations) on the inconsistent database to achieve a consistent state.

Definition 19. Let $r$ be a repair constraint of the form

$\leftarrow insert(A_1), \ldots, insert(A_k),\ delete(A_{k+1}), \ldots, delete(A_m),\ B_1, \ldots, B_l,\ \mathit{not}\ B_{l+1}, \ldots, \mathit{not}\ B_n,\ \varphi$

where $A_1, \ldots, A_m, B_1, \ldots, B_n$ are atoms. Then, $dj(r)$ denotes the denial rule

$\leftarrow A'_1, \ldots, A'_k,\ \neg A'_{k+1}, \ldots, \neg A'_m,$
$\qquad ((B_1, \mathit{not}\ \neg B'_1) \vee B'_1), \ldots, ((B_l, \mathit{not}\ \neg B'_l) \vee B'_l),$
$\qquad ((\mathit{not}\ B_{l+1}, \mathit{not}\ B'_{l+1}) \vee \neg B'_{l+1}), \ldots, ((\mathit{not}\ B_n, \mathit{not}\ B'_n) \vee \neg B'_n),\ \varphi$

where $A'_i$ ($B'_i$) is derived from $A_i$ ($B_i$) by replacing the predicate symbol, say $p$,
with $p_d$ and $\varphi$ is a conjunction of built-in atoms. Let $\mathcal{RC}$ be a set of repair
constraints, then $\mathcal{DP}(\mathcal{RC}) = \{\, dj(r) \mid r \in \mathcal{RC} \,\}$. Moreover, $\mathcal{DP}(\mathcal{IC}, \mathcal{RC})$ denotes
the set $\mathcal{DP}(\mathcal{IC}) \cup \mathcal{DP}(\mathcal{RC})$. □

Example 10. Consider the database of Example 9. The repair constraint

$\leftarrow delete(e(X, S)),\ e(X, S'),\ S' > S$

is rewritten as

$\leftarrow \neg e_d(X, S),\ ((e(X, S'),\ \mathit{not}\ \neg e_d(X, S')) \vee e_d(X, S')),\ S' > S$ □

Given a database $D$, a set of integrity constraints $\mathcal{IC}$ and a set of repair
constraints $\mathcal{RC}$ over $D$, then for each stable model $M$ of $\mathcal{DP}(\mathcal{IC}, \mathcal{RC})_D$, $\mathcal{R}(M)$
denotes the following pair:
$\mathcal{R}(M) = (\, \{p(t) \mid p_d(t) \in M \wedge p(t) \notin D\},\ \{p(t) \mid \neg p_d(t) \in M \wedge p(t) \in D\} \,)$.

Lemma 1. Let $\mathcal{IC}$ be a set of integrity constraints and $\mathcal{RC}$ a set of repair
constraints over a database $D$, then $M$ is a stable model for $\mathcal{DP}(\mathcal{IC}, \mathcal{RC})_D$ if and
only if it is a stable model for $\mathcal{DP}(\mathcal{IC})_D$ satisfying $\mathcal{DP}(\mathcal{RC})$. □

The following theorem states the relation between feasible repairs and stable 
models. 

Theorem 3. Given a database $D$, a set of integrity constraints $\mathcal{IC}$ and a set of
repair constraints $\mathcal{RC}$ over $D$, then

1. (Soundness) for every stable model $M$ of $\mathcal{DP}(\mathcal{IC}, \mathcal{RC})_D$, $\mathcal{R}(M)$ is a feasible
   repair for $D$;

2. (Completeness) for every feasible database repair $R$ for $D$ there exists a stable
   model $M$ for $\mathcal{DP}(\mathcal{IC}, \mathcal{RC})_D$ such that $R = \mathcal{R}(M)$. □

An update constraint is a repair constraint of the form $\leftarrow up(p(t_1, \ldots, t_n))$
where $up \in \{delete, insert\}$ and $t_1, \ldots, t_n$ are terms (constants or variables).
Update constraints define the type of update operations allowed on the database.

6 Prioritized Repairs 

In this section we extend our framework by considering prioritized updates. 
These rules can be useful when the database is derived from the integration of 
different databases since they can be used to take into account the origin of data. 

Definition 20. A prioritized update rule is a rule of the form $up_1(A) \preceq up_2(B)$
where $up_1, up_2 \in \{insert, delete\}$, and $A$ and $B$ are atoms. □

For any two update atoms $a_1$ and $a_2$, if $a_1 \preceq a_2$ then we say that $a_2$ has
higher priority than $a_1$. Moreover, $a_1 \prec a_2$ if $a_1 \preceq a_2$ and $a_1 \neq a_2$. A priority
statement $a_1 \preceq a_2$ means that for each instance $a'_1$ of $a_1$ and for each instance
$a'_2$ of $a_2$, $a'_1 \preceq a'_2$ holds. Clearly, the sets of ground instantiations of $a_1$ and
$a_2$ must have empty intersection. In the following we shall denote with $\mathcal{PC}$ the set
of prioritized update rules and with $\mathcal{PC}^*$ the reflexive, transitive closure of $\mathcal{PC}$.

The relation $\sqsubseteq$ is defined over the feasible repairs of $D$ as follows. For any
repairs $R_1$, $R_2$ and $R_3$ of $D$,

1. $R_1 \sqsubseteq R_1$,

2. $R_1 \sqsubseteq R_2$ if a) $\exists e_2 \in R_2 - R_1$, $\exists e_1 \in R_1 - R_2$ such that
   $(e_1 \preceq e_2) \in \mathcal{PC}^*$ and b) $\nexists e_3 \in R_1 - R_2$ such that $(e_2 \prec e_3) \in \mathcal{PC}^*$,

3. if $R_1 \sqsubseteq R_2$ and $R_2 \sqsubseteq R_3$, then $R_1 \sqsubseteq R_3$.

If $R_1 \sqsubseteq R_2$ we say that $R_2$ is preferable to $R_1$. Moreover, we write $R_1 \sqsubset R_2$ if
$R_1 \sqsubseteq R_2$ and $R_1 \neq R_2$. A set of updates $R$ is a preferred repair for $D$ if $R$ is a
repair for $D$ and there is no repair $R'$ for $D$ which is preferable to $R$.

Definition 21. Let $D$ be a database, $\mathcal{IC}$ a set of full integrity constraints,
$\mathcal{RC}$ a set of repair constraints and $\mathcal{PC}$ a set of prioritized update rules. Then,
$\mathcal{DP}(\mathcal{IC}, \mathcal{RC}, \mathcal{PC})$ denotes the prioritized generalized extended disjunctive
program $(\mathcal{DP}(\mathcal{IC}, \mathcal{RC}), \Phi(\mathcal{PC}))$ where
$\Phi(\mathcal{PC}) = \{\, A'_1 \preceq A'_2 \mid up_1(A_1) \preceq up_2(A_2) \in \mathcal{PC} \,\}$ and
$A'_i$ is derived from $up_i(A_i)$ as follows:

1. $A'_i = p_d(t_1, \ldots, t_n)$ if $up_i(A_i) = insert(p(t_1, \ldots, t_n))$,

2. $A'_i = \neg p_d(t_1, \ldots, t_n)$ if $up_i(A_i) = delete(p(t_1, \ldots, t_n))$. □

Thus, the new disjunctive program is obtained by rewriting integrity
constraints into generalized extended disjunctive rules, repair constraints into
generalized extended denials and then adding prioritized rules obtained from the
rewriting of prioritized update rules. In the following, $\mathcal{DP}(\mathcal{IC}, \mathcal{RC}, \mathcal{PC})$ denotes
the pair $(\mathcal{DP}(\mathcal{IC}, \mathcal{RC}), \Phi(\mathcal{PC}))$.

Theorem 4. Let $D$ be a database, $\mathcal{IC}$ a set of full integrity constraints, $\mathcal{RC}$ a
set of repair constraints and $\mathcal{PC}$ a set of prioritized update rules, then

1. (Soundness) for every preferred stable model $M$ of $\mathcal{DP}(\mathcal{IC}, \mathcal{RC}, \mathcal{PC})_D$,
   $\mathcal{R}(M)$ is a preferred repair for $D$;

2. (Completeness) for every preferred database repair $S$ for $D$ there exists a
   preferred stable model $M$ for $\mathcal{DP}(\mathcal{IC}, \mathcal{RC}, \mathcal{PC})_D$ such that $S = \mathcal{R}(M)$. □

Prioritized update rules can be useful in the integration of databases to take
into account the origin of data, for instance, when we want to give preference
(credit) to the data coming from one source with respect to those coming from
another.

Each atom is of the form $p[s](t_1, \ldots, t_n)$ where $p$ is a predicate symbol and $s$
is a source database identifier. Moreover, we denote with $*$ a generic database
and $p[*](t_1, \ldots, t_n)$ is also denoted by $p(t_1, \ldots, t_n)$ (i.e. the origin of data is
not important). Inside repair and prioritized constraints delete operations may
refer to the origin of data.

In this case each tuple $p(t_1, \ldots, t_n)$ coming from a source $s$ is stored as
$p(t_1, \ldots, t_n, s)$ and the constraints are rewritten accordingly.

7 Conclusions 

The proposed technique is sound and complete for universally quantified
constraints whereas previously defined techniques only consider restricted cases.
In the general case, our technique is quite expensive (checking if a fact belongs
to the consistent answer of a query $Q$ is bounded in the second level of the
polynomial hierarchy ($\Pi_2^p$)), but there are significant classes of constraints such
as functional and inclusion dependencies which can be computed efficiently; in
fact, for these classes of constraints, computing nondeterministically a repair
can be done in polynomial time and computing the consistent answer is
polynomial if the program appearing in the query is empty [10]. The introduction
of repair constraints and prioritized update rules makes it possible to restrict
the set of feasible repairs; clearly, in the general case, the complexity increases
since checking if there exists a preferred repair (resp. for each preferred repair)
for a database $D$, such that the answer to a query $Q$ is not empty, is complete
for the third level of the polynomial hierarchy ($\Sigma_3^p$-complete and
$\Pi_3^p$-complete, respectively) [10].